A CLI command to find duplicate values in a MongoDB collection.

Frédéric G. MARAND 802611a515 Just a tiny script. 6 months ago

README.md

mongodb_duplicates

Install

  • Prerequisites:
    • Go SDK 1.22+
    • The URL and credentials to a working MongoDB server
  • Optional: make lint to verify code
  • make
  • make install

Use

A CLI command to find duplicate values in a MongoDB collection.

  1. In your working directory, create a .env file based on the example provided in example.env
  2. Install the envrun command to load environment variables from that file:
    • go install github.com/fgm/envrun@latest
  3. Adjust your .env. For the first check, configure it for an empty collection in an empty database:
    • MONGODB_URL: default = mongodb://localhost:27017
    • MONGODB_DB: default = test
    • MONGODB_COLLECTION: default = test
    • MONGODB_FIELD: the field to check for duplicates. Default = email
  4. Run the command in read-only mode
    • $ envrun go run ./docs/osinet/cmd/duplicates
    • It should not display anything, since the collection is empty.
  5. Re-run the command with seed generation. The seed data will remain in the collection.
    • $ envrun go run ./docs/osinet/cmd/duplicates -command seed
    • user1@example.com: 3
    • user2@example.com: 2
    • The results show 3 duplicates of user1 and 2 of user2.
  6. Now adjust configuration for the actual collection you want to check.
  7. Run the command in read-only mode
    • $ envrun go run ./docs/osinet/cmd/duplicates
    • It prints each value of the field for which duplicates exist, along with the number of documents holding that value.

License

Licensed under the Apache 2.0 license.