1. Installation
To use oagdedupe, first install it using pip:
pip install oagdedupe
1.1. .dedupe
Create a cache folder in your working directory to store postgres and label-studio data. Both containers below mount this folder as a volume.
mkdir .dedupe
1.2. docker
We use docker to run postgres and label-studio. This is the most convenient option, but it is not required; both can be installed any way you like. The only requirement is that the running postgres database has plpython3 installed.
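If you run your own postgres instead of the docker image, one way to check the plpython3 requirement is to look for the `plpython3u` extension, which is the name PL/Python 3 is registered under in postgres. The sketch below assumes `psql` is installed on your machine; the helper name `check_plpython3` and its arguments are hypothetical, not part of oagdedupe.

```shell
# SQL that lists the PL/Python 3 extension if the server has it available.
# installed_version stays NULL until CREATE EXTENSION has been run in that database.
PLPYTHON_CHECK_SQL="SELECT name, default_version, installed_version FROM pg_available_extensions WHERE name = 'plpython3u';"

# Hypothetical helper: run the check against a postgres server you manage yourself.
# Usage: check_plpython3 HOST PORT USER DBNAME
check_plpython3() {
  psql -h "$1" -p "$2" -U "$3" -d "$4" -c "$PLPYTHON_CHECK_SQL"
}
```

For example, `check_plpython3 localhost 5432 username db`. A result with zero rows suggests the PL/Python 3 package is not installed on that server.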
If you do not already have docker installed, see: https://docs.docker.com/get-started/#download-and-install-docker
1.3. postgres
Start postgres using docker, replacing [POSTGRES_PORT] with the port to expose on your host machine:
docker run --rm -dp [POSTGRES_PORT]:5432 \
--name oagdedupe-postgres \
--env POSTGRES_USER=username \
--env POSTGRES_PASSWORD=password \
--env POSTGRES_DB=db \
--env PGDATA=/var/lib/pgsql/data/pgdata \
-v "`pwd`/.dedupe:/var/lib/pgsql/data" \
chansoosong/oagdedupe-postgres
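Because the container starts in the background, it can help to wait until postgres accepts connections before pointing anything at it. The following is a minimal sketch, not part of oagdedupe: the helper names `pg_uri` and `wait_for_postgres` are hypothetical, and it assumes the `pg_isready` utility is available on the image's PATH, which is typical for postgres images but not confirmed here.

```shell
# Build a connection URI from the credentials passed to the docker run command.
# Usage: pg_uri PORT   (PORT is whatever you substituted for [POSTGRES_PORT])
pg_uri() {
  printf 'postgresql://%s:%s@localhost:%s/%s\n' username password "$1" db
}

# Poll until the server inside the container accepts connections.
# Assumes pg_isready exists in the image; gives up after about 30 seconds.
wait_for_postgres() {
  i=0
  while [ "$i" -lt 30 ]; do
    if docker exec oagdedupe-postgres pg_isready -U username -d db >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "postgres did not become ready in time" >&2
  return 1
}
```

For example, `wait_for_postgres && pg_uri 8000` prints a URI for host port 8000 once the server is ready. The user, password, and database name are the ones set through POSTGRES_USER, POSTGRES_PASSWORD, and POSTGRES_DB.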
1.4. label-studio
Start label-studio using docker, replacing [LS_PORT] with the port to expose on your host machine:
docker run --rm -it -dp [LS_PORT]:8080 \
--name oagdedupe-labelstudio \
--env LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true \
--env LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=/label-studio/files \
-v "`pwd`/.dedupe:/label-studio/data" \
-v "`pwd`/.dedupe:/label-studio/files" \
heartexlabs/label-studio:latest label-studio
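Label-studio can take a little while to start. As a rough sketch, the web interface can be polled from the host until it answers; the helper names `ls_url` and `wait_for_labelstudio` are hypothetical, and the example assumes `curl` is installed on the host.

```shell
# Build the local address of the label-studio web interface.
# Usage: ls_url PORT   (PORT is whatever you substituted for [LS_PORT])
ls_url() {
  printf 'http://localhost:%s\n' "$1"
}

# Poll the web interface until it responds without an HTTP error.
# Usage: wait_for_labelstudio PORT   (gives up after about 60 seconds)
wait_for_labelstudio() {
  i=0
  while [ "$i" -lt 60 ]; do
    if curl -fs -o /dev/null "$(ls_url "$1")"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "label-studio did not respond in time" >&2
  return 1
}
```

For example, `wait_for_labelstudio 8089 && ls_url 8089` prints the address to open in a browser. If the wait times out, `docker logs oagdedupe-labelstudio` shows the container's startup output.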