A CLI command to find duplicate values in a MongoDB collection.

Frédéric G. MARAND b1cc869e34 Initial commit: just a tiny script. 4 months ago
.gitignore b1cc869e34 Initial commit: just a tiny script. 4 months ago
LICENSE ff7bcced31 Initial commit 4 months ago
Makefile b1cc869e34 Initial commit: just a tiny script. 4 months ago
README.md b1cc869e34 Initial commit: just a tiny script. 4 months ago
example.env b1cc869e34 Initial commit: just a tiny script. 4 months ago
go.mod b1cc869e34 Initial commit: just a tiny script. 4 months ago
go.sum b1cc869e34 Initial commit: just a tiny script. 4 months ago
main.go b1cc869e34 Initial commit: just a tiny script. 4 months ago

README.md

mongodb_duplicates

Install

  • Prerequisites:
    • Go SDK 1.22+
    • The URL and credentials to a working MongoDB server
  • Optional: make lint to verify code
  • make
  • make install

Use

A CLI command to find duplicate values in a MongoDB collection.

  1. In your working directory, create a .env file based on the example provided in https://code.osinet.fr/fgm/mongodb_duplicates/src/master/example.env
  2. Install the envrun command to load environment variables from that file:
    • go install github.com/fgm/envrun@latest
  3. Adjust your .env. For the first check, configure it for an empty collection in an empty database:
    • MONGODB_URL: default = mongodb://localhost:27017
    • MONGODB_DB: default = test
    • MONGODB_COLLECTION: default = test
    • MONGODB_FIELD: the field to check for duplicates. Default = email
  4. Run the command in read-only mode
    • $ envrun go run ./docs/osinet/cmd/duplicates
    • It should not display anything, since the collection is empty.
  5. Re-run the command with seed generation. The seed data will remain in the collection.
    • $ envrun go run ./docs/osinet/cmd/duplicates -command seed
    • user1@example.com: 3
    • user2@example.com: 2
    • The result shows 3 duplicates of user1 and 2 of user2.
  6. Now adjust configuration for the actual collection you want to check.
  7. Run the command in read-only mode
    • $ envrun go run ./docs/osinet/cmd/duplicates
    • It reports each value of the field for which duplicates exist, along with the number of documents holding that value.
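
The steps above can be illustrated with a minimal, standard-library-only Go sketch. It is not the code in main.go: it only mirrors the documented behaviour, reading the four variables with their documented defaults and printing `value: count` for every value seen more than once. The names `Config`, `envOr`, `loadConfig`, `Duplicate`, and `findDuplicates` are hypothetical, and the sample values are made up to resemble the seed output. On the server side, this kind of check is commonly written as an aggregation that groups on the field, counts documents per group, and keeps groups with a count above 1; the README does not say whether the command works that way.

```go
package main

import (
	"fmt"
	"os"
	"sort"
)

// Config holds the four settings listed in the README (hypothetical type).
type Config struct {
	URL, DB, Collection, Field string
}

// envOr returns the environment value for key, or fallback when the
// variable is unset or empty.
func envOr(key, fallback string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return fallback
}

// loadConfig applies the defaults documented in the README.
func loadConfig() Config {
	return Config{
		URL:        envOr("MONGODB_URL", "mongodb://localhost:27017"),
		DB:         envOr("MONGODB_DB", "test"),
		Collection: envOr("MONGODB_COLLECTION", "test"),
		Field:      envOr("MONGODB_FIELD", "email"),
	}
}

// Duplicate is one reported line: a field value and its document count.
type Duplicate struct {
	Value string
	Count int
}

// findDuplicates counts each value and returns those occurring more than
// once, sorted by value so the output is deterministic.
func findDuplicates(values []string) []Duplicate {
	counts := make(map[string]int)
	for _, v := range values {
		counts[v]++
	}
	var dups []Duplicate
	for v, n := range counts {
		if n > 1 {
			dups = append(dups, Duplicate{Value: v, Count: n})
		}
	}
	sort.Slice(dups, func(i, j int) bool { return dups[i].Value < dups[j].Value })
	return dups
}

func main() {
	cfg := loadConfig()
	fmt.Printf("checking field %q in %s.%s at %s\n", cfg.Field, cfg.DB, cfg.Collection, cfg.URL)

	// Made-up sample standing in for the field values of a seeded collection.
	values := []string{
		"user1@example.com", "user2@example.com", "user1@example.com",
		"user3@example.com", "user2@example.com", "user1@example.com",
	}
	for _, d := range findDuplicates(values) {
		fmt.Printf("%s: %d\n", d.Value, d.Count)
	}
}
```

With the sample above, the loop prints `user1@example.com: 3` followed by `user2@example.com: 2`, the same shape as the seed run. Counting in memory is only practical for small data sets; against a real collection the grouping would normally be left to the server.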

License

Licensed under the Apache 2.0 license.