PDF Privacy: What Happens to Files on Free Online Tools
"Your files are deleted within one hour." It is a common promise from free online PDF tools. But between the moment you upload a file and the moment it is deleted, your document sits on infrastructure owned by a company you have never heard of, subject to policies written by their lawyers and enforced — or not — by their engineers. Here is what that actually means.
The infrastructure behind "free"
Running a popular online PDF tool requires substantial infrastructure. Processing PDFs — merging, compressing, converting — requires CPU time. Storing files temporarily requires disk space. Serving millions of users requires bandwidth. None of this is free.
Most free tools run on cloud infrastructure: Amazon Web Services, Google Cloud Platform, or Microsoft Azure. Your uploaded PDF is typically written to an object storage bucket (S3, GCS, Azure Blob Storage) in one or more geographic regions. Which regions? That depends on the tool's configuration — and unless the privacy policy specifies, you do not know.
For users in the EU, this matters legally. GDPR restricts the transfer of personal data to countries outside the EEA without appropriate safeguards. If your PDF contains personal data (a contract with names and addresses, a medical document, an employment record) and it is processed on a US server, there is a potential GDPR compliance issue — regardless of whether the tool claims GDPR compliance on its homepage.
Analysing what major PDF tools actually say
We reviewed the terms of service and privacy policies of several major free PDF tools. What we found:
File retention windows
Most policies state a retention window of one to a few hours. "Files are deleted automatically after X hours" is standard language. However:
- The clock typically starts when processing completes, not when you upload. If the tool is under heavy load, processing may be delayed, so the file can sit on the server for longer than the stated window.
- Backup systems may retain copies longer than the stated window. Automated backups of storage buckets are common practice and are not always addressed in user-facing policies.
- Log data (file names, sizes, IP addresses, timestamps) is typically retained much longer than the files themselves — often 30–90 days for operational logs and indefinitely in some anonymised analytics systems.
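The three caveats above can be turned into simple arithmetic. The following is a minimal sketch, not a model of any real tool: the names `RetentionPolicy`, `Exposure` and `exposure` are hypothetical. It assumes a backup may be taken at any point until the primary copy is deleted, and that log entries are written at upload time.

```typescript
// All durations are in minutes, measured from the moment of upload.
// Every field here is an illustrative assumption, not a real tool's policy.
interface RetentionPolicy {
  statedWindowMin: number;    // the advertised "deleted after X" window
  backupRetentionMin: number; // how long backups keep a copy (0 if none)
  logRetentionMin: number;    // how long file names, IPs and timestamps are kept
}

interface Exposure {
  fileOnServerMin: number; // until the primary copy is deleted
  lastCopyMin: number;     // until the last copy, including backups, is gone
  metadataMin: number;     // until log entries about the upload expire
}

function exposure(
  policy: RetentionPolicy,
  queueDelayMin: number,
  processingMin: number,
): Exposure {
  // The retention clock starts when processing completes, not at upload.
  const processingDone = queueDelayMin + processingMin;
  const fileOnServerMin = processingDone + policy.statedWindowMin;

  // Worst case: a backup is taken just before the primary copy is deleted.
  const lastCopyMin =
    policy.backupRetentionMin > 0
      ? fileOnServerMin + policy.backupRetentionMin
      : fileOnServerMin;

  // Assumption: the log entry is written at upload time.
  const metadataMin = policy.logRetentionMin;

  return { fileOnServerMin, lastCopyMin, metadataMin };
}
```

With a stated one-hour window, a 20-minute queue and 5 minutes of processing, the primary copy exists for 85 minutes after upload, and a seven-day backup policy pushes the worst case for the last copy past a week.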
Improvement clauses
"We may use information about how you use our service to improve our products" appears in most policies. In some cases, this clause is broad enough to cover using uploaded files to train or evaluate document processing models. "Anonymised" data from PDF operations — document structure, file size distributions, content type patterns — can be extracted without reading your document's content. Whether tools do this in practice is not disclosed.
Third-party access
Every policy we reviewed names multiple categories of third parties that may receive user data:
- Cloud infrastructure providers (AWS, GCP, Azure) who physically host the files
- Analytics providers (Google Analytics, Amplitude) who receive usage data
- Payment processors if there is a paid tier
- Legal authorities "when required by law" — a clause that in some jurisdictions can be triggered by requests that do not require a court order
Acquisition and business change language
Most policies include a clause permitting data transfer in the event of a merger, acquisition, or sale of assets. If a PDF tool company is acquired, your upload history — and potentially retained files — may transfer to the acquirer. The acquirer's privacy policy then applies, which you have not agreed to.
What the Network tab reveals
We tested several popular free PDF tools with a clearly labelled test document and monitored network traffic in the browser's Network tab. Three patterns stood out:
- Files are uploaded before the operation runs. In every server-based tool we tested, the file was transmitted to the server the moment the operation was triggered, and in some cases before the operation parameters had been confirmed.
- Analytics events fire throughout. Events tracking user behaviour — which page, what action, how long it took — fire continuously to third-party analytics services. These events include session identifiers that can be used to link upload events to later visits.
- Some tools make multiple outbound calls to different domains. A single file operation can involve the tool's primary servers, a CDN, an analytics provider, and an ad network, each receiving some signal about the operation.
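To repeat this kind of observation yourself, the Network tab in most browsers can export a session as a HAR file, which is plain JSON. The sketch below counts requests and request-body bytes per domain. It assumes only the standard HAR fields `log.entries[].request.url` and `request.bodySize`; the function name `summariseByDomain` and the domains used with it are hypothetical.

```typescript
// Minimal subset of the HAR structure that this sketch relies on.
interface HarEntry {
  request: { method: string; url: string; bodySize: number };
}
interface Har {
  log: { entries: HarEntry[] };
}

interface DomainSummary {
  requests: number;  // how many requests went to this domain
  bytesSent: number; // total request-body bytes sent to this domain
}

function summariseByDomain(har: Har): Map<string, DomainSummary> {
  const summary = new Map<string, DomainSummary>();
  for (const { request } of har.log.entries) {
    const host = new URL(request.url).hostname;
    const current = summary.get(host) ?? { requests: 0, bytesSent: 0 };
    current.requests += 1;
    // HAR uses -1 when the body size is unknown; count that as 0.
    current.bytesSent += Math.max(request.bodySize, 0);
    summary.set(host, current);
  }
  return summary;
}
```

A domain that receives roughly as many bytes as the size of your document is where the file went. Domains that receive many small requests are more likely analytics or advertising endpoints.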
None of this is illegal, and in most cases it is disclosed somewhere in the policy documents. But it is very different from the "private and secure" marketing language that most tools use.
The specific risk categories
Not all PDFs carry the same risk. The following categories warrant particular caution:
- Documents containing personal data: anything with names, addresses, identification numbers, financial account details, or health information is potentially subject to data protection law. Processing these on a third-party server creates compliance obligations and liability.
- Legal documents under privilege: contracts, attorney-client communications, litigation materials. Some jurisdictions hold that transmitting privileged documents to a third party waives the privilege.
- Trade secrets and proprietary information: confidential business plans, unreleased product specifications, internal financial forecasts. These are often covered by NDAs and internal information security policies.
- Documents from clients or patients: if you work with client documents or patient records, your obligations to those individuals may prevent you from uploading their documents to third-party tools without disclosure and consent.
- Government and defence documents: classified or sensitive government documents should never go near a free online tool. Most organisations that handle such documents have explicit policies prohibiting this.
The structural solution: browser-based processing
The alternative to trusting a third party with your files is processing them in a context where no third party has access: your own browser, on your own device.
Browser-based tools like those on keptlocal process files using code that runs inside the browser tab. The file bytes load from your disk into your device's RAM. The operation runs using your CPU. The output is saved from RAM back to your disk. Once the page itself has loaded, the file operation makes no network request to any external server.
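As a rough illustration of the pattern, and not a description of how any particular tool is implemented, the sketch below reads a file's bytes into memory and inspects them without any network call. Reading the PDF version from the file header is a deliberately trivial stand-in for a real operation such as merging. The names `LocalFile`, `pdfVersion` and `inspectLocally` are hypothetical; a browser `File` object satisfies `LocalFile` because it provides `arrayBuffer()`.

```typescript
// Anything that can hand over its bytes, such as a File chosen via <input type="file">.
interface LocalFile {
  arrayBuffer(): Promise<ArrayBuffer>;
}

// A PDF starts with a header such as "%PDF-1.7"; return the version, or null.
function pdfVersion(bytes: Uint8Array): string | null {
  const header = String.fromCharCode(...bytes.subarray(0, 8));
  const match = /^%PDF-(\d\.\d)/.exec(header);
  return match?.[1] ?? null;
}

async function inspectLocally(file: LocalFile): Promise<string | null> {
  // Disk to RAM: the bytes are read into this tab's memory.
  const bytes = new Uint8Array(await file.arrayBuffer());
  // CPU work on this device. Nothing here calls fetch or XMLHttpRequest.
  return pdfVersion(bytes);
}
```

The point of the design is what is absent: the only input is the local file and the only output is a value in memory, so there is nothing for a server to receive.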
This is verifiable. Open the Network tab in DevTools, run a file operation, and watch. No upload request appears. The file stays local from start to finish.
The limitation is real: browser-based tools cannot do everything a server can. Extremely large files, complex format conversions, and batch automation are areas where server-side processing is still necessary. For those cases, choose a tool with explicit data handling commitments — and ideally one that can run on your own infrastructure.
Practical guidance
- Before uploading any document: ask whether it contains personal data, is subject to confidentiality obligations, or belongs to a client or patient. If any of these apply, use a local tool.
- Read the retention policy before trusting it: "deleted after one hour" means little if backup retention is indefinite. Look for language about backup systems and operational logs, not just the primary file store.
- Check where the servers are: for EU users, look for explicit EU data residency commitments. "GDPR compliant" without EU data residency is a weaker protection than it sounds.
- Verify with the Network tab: for any tool claiming to be local, check that no outbound requests are made during the file operation. This takes thirty seconds and shows what the tool actually sends, whatever its marketing says.
- Use a dedicated solution for high-sensitivity documents: organisations that regularly handle sensitive PDFs should evaluate self-hosted document tools, enterprise PDF software with explicit data handling agreements, or browser-based tools where the processing is genuinely local.
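The Network tab check in the list above can also be run against a HAR export instead of by eye. This is a minimal sketch under stated assumptions: it relies on the standard HAR field `startedDateTime`, the function names `requestsDuring` and `looksLocal` are hypothetical, and you note the start and end time of the file operation yourself.

```typescript
// Minimal subset of the HAR structure that this sketch relies on.
interface TimedHarEntry {
  startedDateTime: string; // ISO 8601 timestamp of when the request started
  request: { method: string; url: string; bodySize: number };
}
interface TimedHar {
  log: { entries: TimedHarEntry[] };
}

// Return every request that started while the file operation was running.
function requestsDuring(har: TimedHar, opStart: string, opEnd: string): TimedHarEntry[] {
  const from = Date.parse(opStart);
  const to = Date.parse(opEnd);
  return har.log.entries.filter((entry) => {
    const started = Date.parse(entry.startedDateTime);
    return started >= from && started <= to;
  });
}

// A tool that processes locally should produce no requests in that window.
function looksLocal(har: TimedHar, opStart: string, opEnd: string): boolean {
  return requestsDuring(har, opStart, opEnd).length === 0;
}
```

An empty result for the operation window is consistent with local processing. Any entry that does show up is worth opening in DevTools to see what it carried.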
keptlocal processes all files locally in your browser. Browse the tools — and verify with DevTools that nothing leaves your device.