Remove PII from documents: Anonymiser API for developers

Enhance your data processing with our Anonymiser API - automatically detect and remove PII from text and PDF documents. A secure anonymisation tool for developers seeking data privacy solutions.
By Dionysis Kastellanis, Founder & Technical Lead at Hello Radius
Updated: 18 November 2024

The critical need for data anonymisation in AI applications

As AI technologies rapidly evolve, developers face a growing challenge: anonymising data before it reaches large language models (LLMs). Personally Identifiable Information (PII) that reaches a model or a third-party service creates privacy and compliance risks, which makes anonymisation a critical requirement for responsible AI development.

Understanding PII removal

What is Text Anonymisation?

Text anonymisation is the process of removing or masking personal information from documents, enabling secure AI data preprocessing while maintaining document context and usability.

Why Remove PII from Text?

  • Protect individual privacy
  • Ensure regulatory compliance
  • Reduce data breach risks
  • Enable safe AI model training

Challenges in secure document redaction

Identifying all forms of personal information across diverse document types requires advanced machine learning algorithms, contextual understanding and continuous algorithm refinement.

Key Challenges:

  • Preserving document semantic meaning
  • Detecting subtle personal identifiers
  • Maintaining data utility
  • Ensuring consistent redaction

Anonymiser API: secure PII removal solution

Our API automatically removes PII from text and PDF documents whilst enabling you to customise the level of redaction to match your needs.

Key API features:

  • Advanced PII Detection: Automatically identifies and redacts sensitive information.
  • Multi-Format Support: Processes both plain text and PDF files.
  • Flexible Redaction Options:
    1. Standard Mode: Uses consistent placeholders.
    2. Unique Mode: Generates unique identifiers for potential reverse-mapping.
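The difference between the two modes can be illustrated with a small local sketch. This is a toy example and not the Anonymiser API's implementation: the email-only regex detector, the [EMAIL] and [EMAIL_1] placeholder formats, and the function names are all assumptions made for illustration. It shows why unique identifiers make reverse-mapping possible while a single shared placeholder does not.

```python
import re

# Toy detector: matches email addresses only. A real PII detector covers far
# more (names, dates, locations) and is unlikely to be a single regex.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def redact_standard(text):
    """Replace every match with the same placeholder (hypothetical format)."""
    return EMAIL.sub("[EMAIL]", text)


def redact_unique(text):
    """Give each distinct value its own identifier and return the mapping."""
    mapping = {}  # identifier -> original value, kept by the caller
    seen = {}     # original value -> identifier, so repeats stay consistent

    def replace(match):
        value = match.group(0)
        if value not in seen:
            ident = f"[EMAIL_{len(seen) + 1}]"
            seen[value] = ident
            mapping[ident] = value
        return seen[value]

    return EMAIL.sub(replace, text), mapping


def restore(text, mapping):
    """Reverse-map identifiers back to the original values."""
    for ident, value in mapping.items():
        text = text.replace(ident, value)
    return text
```

With a shared placeholder the original values cannot be recovered from the redacted text, whereas the mapping returned by redact_unique lets the caller restore them after downstream processing.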

Key considerations:

  • Preserves document context
  • Maintains data usability
  • Ensures consistent anonymisation

Security guarantees: documents are processed immediately and no data is stored.

How to use the API

API endpoint

Send a POST request (form-data) to the API endpoint https://api.helloradius.com/v1/anonymiser with either a PDF file or plain text supplied in the text body parameter. The redaction body parameter selects the redaction mode, for example standard.

Authentication

The API requires an api-key header for authentication, which keeps access controlled during our early-access phase. The early-access key is: linkedin-early-access-api-key-2024.

Test the API

Open your terminal and use curl:

curl --location 'https://api.helloradius.com/v1/anonymiser' \
  --header 'api-key: linkedin-early-access-api-key-2024' \
  --form 'text="John Doe is a software engineer based in San Francisco, with a degree from Stanford University and over 10 years of experience in fintech and AI. He led the development of a major platform at XYZ Corp from 2018 to 2021 and enjoys mentoring, hackathons, and outdoor activities near Lake Tahoe. Born on June 15, 1990, in Chicago, he speaks both English and Spanish."' \
  --form 'redaction="standard"'
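The same request can be made from Python. The sketch below uses only the standard library and mirrors the curl call above: a multipart form-data POST with the text and redaction fields and the api-key header. The function names build_request and anonymise are hypothetical, and because the response format is not described here, the response body is returned as raw text rather than parsed. Only the standard redaction value appears in the curl example, so that is the only value used.

```python
import urllib.request
import uuid

API_URL = "https://api.helloradius.com/v1/anonymiser"
API_KEY = "linkedin-early-access-api-key-2024"  # early-access key given above


def build_request(text, redaction="standard"):
    """Build a multipart/form-data POST with `text` and `redaction` fields."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in (("text", text), ("redaction", redaction)):
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")  # closing boundary
    body = "".join(parts).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "api-key": API_KEY,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def anonymise(text, redaction="standard", timeout=30):
    """Send the request and return the raw response body as text.

    Assumption: the response is UTF-8 text; its structure is not documented
    in this article, so no parsing is attempted.
    """
    request = build_request(text, redaction)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling anonymise("John Doe is a software engineer based in San Francisco.") sends the request. Keeping request construction in build_request makes it possible to inspect the headers and body without touching the network.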

Postman

Here’s our API dev kit Postman collection to help you set up and test the Anonymiser API with ease.

Our commitment to privacy-first AI

Initially developed for Hello Radius, this API represents our commitment to responsible AI development and can be integrated into various scenarios where sensitive data is processed.

Join the community

We're actively seeking developer feedback to improve the Anonymiser API. Share your experiences and report issues to support@helloradius.com.

Help us build a more privacy-conscious AI ecosystem!