A CLI command to find duplicate values in a MongoDB collection.

Frédéric G. MARAND 202dcf46e5 Just a tiny script. hace 4 meses
.gitignore 202dcf46e5 Just a tiny script. hace 4 meses
LICENSE ff7bcced31 Initial commit hace 4 meses
Makefile 202dcf46e5 Just a tiny script. hace 4 meses
README.md 202dcf46e5 Just a tiny script. hace 4 meses
example.env 202dcf46e5 Just a tiny script. hace 4 meses
go.mod 202dcf46e5 Just a tiny script. hace 4 meses
go.sum 202dcf46e5 Just a tiny script. hace 4 meses
main.go 202dcf46e5 Just a tiny script. hace 4 meses

README.md

mongodb_duplicates

Install

  • Prerequisites:
    • Go SDK 1.22+
    • The URL and credentials to a working MongoDB server
  • Optional: make lint to verify code
  • make
  • make install

Use

A CLI command to find duplicate values in a MongoDB collection.

  1. In your working directory, create a .env file based on the example provided in https://code.osinet.fr/fgm/mongodb_duplicates/src/master/example.env
  2. Install the envrun command to load environment variables from that file:
    • go install github.com/fgm/envrun@latest
  3. Adjuster your .env. For the first check, configure it for an empty collection in an empty database:
    • MONGODB_URL: default = mongodb://localhost:27017
    • MONGODB_DB: default = test
    • MONGODB_COLLECTION: default = test
    • MONGODB_FIELD: the duplicate field. Default = email
  4. Run the command in read-only mode
    • $ envrun go run ./docs/osinet/cmd/duplicates
    • it should not display anything since the collection is empty
  5. Re-run the command with seed generation. The seed data will remain in the collection.
    • $ envrun go run ./docs/osinet/cmd/duplicates -command seed
    • user1@example.com: 3
    • user2@example.com: 2
    • Resultat show show 3 duplicates of user1 and 2 of user2.
  6. Now adjust configuration for the actual collection you want to check.
  7. Run the command in read-only mode
    • $ envrun go run ./docs/osinet/cmd/duplicates
    • It will give you the values of the field for which duplicates exist, and the document count for that value

License

Licensed under the Apache 2.0 license.