Automatically Mapping TX Text Control Form Fields to JSON Data in .NET C#

In this article, we will explore how to automatically map TX Text Control form fields to JSON data in a .NET C# application. This process can help streamline data handling and improve the efficiency of your application when working with form fields in TX Text Control.

Document templates are powerful, but they often come with a very practical problem: field names are not always consistent.

A JSON data source might contain a property named company_name, while the template uses companyname, companyName, or even name_company. In a perfect world, every template and every data source would share the same naming convention. In real projects, templates are edited by different teams, imported from older systems, or reused across customers, departments, and products.

This project, TxFormFieldMapper, demonstrates how to automatically map and rename TX Text Control form fields based on a JSON data source, even when the names are not an exact match.

The goal is simple:

JSON field:      company_name
Template field:  companyname
Result:          companyname is renamed to company_name

After the mapping step, the template contains form field names that match the JSON data source. This makes later data population, merging, and automation much easier.

Typical Use Cases

This approach is useful whenever templates and data sources evolve independently.

Common scenarios include:

  • Customer-specific document templates that should be connected to a shared JSON schema
  • Legacy templates where form field names use older naming conventions
  • Automatically preparing templates before a merge or data-binding step
  • Reducing manual cleanup work in document automation workflows
  • Normalizing form fields across many templates
  • Mapping templates maintained by non-developers to structured API data

For example, a sales proposal template might contain fields such as:

company_name
firstName
street_address
street1

The JSON data source might contain:

{
  "companyname": "Text Control GmbH",
  "firstname": "Bjorn",
  "street": "123 Sample Street"
}

The mapper compares the template fields to the JSON fields and renames the template fields to the closest JSON field names.

Project Overview

The sample is a .NET console application using TX Text Control.

When started without parameters, it reads the template and the JSON data from the data folder and writes the mapped template back to the same folder:

data\forms.tx
data\sample-data.json
data\mapped-forms.tx

The app:

  1. Opens a TX Text Control template document.
  2. Reads the available JSON field names.
  3. Iterates over the document form fields.
  4. Finds the best JSON field match for each form field.
  5. Renames the form field using FormFieldCollection.
  6. Saves the mapped template as a new TX document.

The project can also be started with explicit paths:

dotnet run -- template.tx data.json mapped-template.tx 0.72

The optional final value is the minimum matching score. The default is 0.72.
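
The argument handling described above could look roughly like the following. This is a minimal sketch, not the sample's AppOptions class: the type name OptionsSketch, its members, and the usage message are hypothetical, and only the default paths and the 0.72 default come from the text.

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical options type; the real AppOptions.cs may be structured differently.
public sealed record OptionsSketch(
    string TemplatePath, string JsonPath, string OutputPath, double MinimumScore)
{
    public const double DefaultMinimumScore = 0.72;

    public static OptionsSketch Parse(string[] args)
    {
        // No arguments: fall back to the files in the data folder.
        if (args.Length == 0)
        {
            return new OptionsSketch(
                Path.Combine("data", "forms.tx"),
                Path.Combine("data", "sample-data.json"),
                Path.Combine("data", "mapped-forms.tx"),
                DefaultMinimumScore);
        }

        if (args.Length < 3 || args.Length > 4)
        {
            throw new ArgumentException(
                "Usage: <template> <json> <output> [minimumScore]");
        }

        // Parse the optional score with the invariant culture so that
        // "0.72" is read the same way on every machine.
        double minimumScore = args.Length == 4
            ? double.Parse(args[3], CultureInfo.InvariantCulture)
            : DefaultMinimumScore;

        return new OptionsSketch(args[0], args[1], args[2], minimumScore);
    }
}
```

Parsing with the invariant culture is a deliberate choice here: on systems that use a decimal comma, a culture-sensitive parse of 0.72 could fail or produce a different value.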

Why Fuzzy Matching Is Needed

Exact string comparison is too strict for real-world template mapping.

These names should probably match:

company_name  <->  companyname
first_name    <->  firstName
street        <->  street_address
name_company  <->  company_name

But a simple equality check would reject all of them.

TxFormFieldMapper uses a small fuzzy-matching pipeline. The idea is not to guess blindly, but to apply several simple name comparison techniques and only accept matches that are confident enough.

Extracting JSON Field Names

The JsonFieldExtractor class parses the JSON document with System.Text.Json.

For every object property, it collects both:

  • the leaf name
  • the nested path

For example:

{
  "customer": {
    "company_name": "Text Control GmbH"
  }
}

This produces:

customer
customer.company_name
company_name

Collecting both forms allows templates to use either short field names such as company_name or more explicit names such as customer.company_name.
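
The extraction step can be sketched with JsonDocument from System.Text.Json. This is an illustration under assumptions, not the sample's JsonFieldExtractor: the class name JsonFieldNameSketch is hypothetical, and descending into arrays without adding an index to the path is a choice made for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper that collects leaf names and dotted paths.
public static class JsonFieldNameSketch
{
    public static SortedSet<string> Extract(string json)
    {
        var names = new SortedSet<string>(StringComparer.Ordinal);
        using JsonDocument document = JsonDocument.Parse(json);
        Walk(document.RootElement, "", names);
        return names;
    }

    private static void Walk(JsonElement element, string path, ISet<string> names)
    {
        switch (element.ValueKind)
        {
            case JsonValueKind.Object:
                foreach (JsonProperty property in element.EnumerateObject())
                {
                    string fullPath = path.Length == 0
                        ? property.Name
                        : path + "." + property.Name;

                    names.Add(property.Name); // leaf name
                    names.Add(fullPath);      // nested path
                    Walk(property.Value, fullPath, names);
                }
                break;

            case JsonValueKind.Array:
                // Assumption: array items share the path of the array itself.
                foreach (JsonElement item in element.EnumerateArray())
                {
                    Walk(item, path, names);
                }
                break;
        }
    }
}
```

For the nested customer document shown above, this sketch yields the same three names: customer, customer.company_name, and company_name. A top-level property has identical leaf name and path, so the set stores it once.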

Tokenizing Field Names

The FieldNameTokenizer class breaks names into lowercase tokens.

It understands common naming styles:

company_name     -> company, name
firstName        -> first, name
street_address   -> street, address
street1          -> street1

It also creates a normalized form by joining the tokens:

company_name -> companyname
firstName    -> firstname

This normalized form is useful because many differences are only cosmetic. For example, company_name, companyName, and companyname all normalize to the same string, companyname.
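
Tokenization and normalization can be approximated with a single regular expression. The following is a minimal sketch, assuming that separators and lower-to-upper case changes mark token boundaries and that trailing digits stay attached to their token, as in the street1 example. The class name TokenizerSketch and the pattern are illustrative and not taken from the sample's FieldNameTokenizer.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical tokenizer; the pattern is an assumption for illustration.
public static class TokenizerSketch
{
    // 1) a lowercase run with an optional leading capital and trailing digits
    // 2) a run of capitals with optional trailing digits
    // 3) a run of digits on its own
    private static readonly Regex TokenPattern =
        new Regex(@"[A-Z]?[a-z]+[0-9]*|[A-Z]+[0-9]*|[0-9]+");

    public static List<string> Tokenize(string name)
    {
        return TokenPattern.Matches(name)
            .Cast<Match>()
            .Select(match => match.Value.ToLowerInvariant())
            .ToList();
    }

    // Joins the tokens so that cosmetic differences disappear.
    public static string Normalize(string name)
    {
        return string.Concat(Tokenize(name));
    }
}
```

With this sketch, Tokenize("firstName") returns first and name, and Normalize returns companyname for company_name, companyName, and companyname alike.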

Scoring Field Name Similarity

For every form field, the mapper compares it with every JSON field. It calculates several scores and uses the best one.

The current scoring techniques are:

  • normalized exact matching
  • token set matching
  • token containment matching
  • Levenshtein similarity

Normalized Exact Matching

If two fields are not textually identical but normalize to the same value, they receive a high score.

Example:

company_name <-> companyname

Both normalize to:

companyname

This receives a score of 0.99.

Token Set Matching

Token set matching compares the words inside the names.

The mapper uses a Jaccard-style comparison:

matching tokens / all unique tokens

This helps when the words are present but ordered differently.

Example:

name_company <-> company_name

Both contain:

company, name
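
The Jaccard-style ratio can be sketched as follows. The class and method names are hypothetical, and the inputs are assumed to be lowercase tokens that have already been produced by a tokenizer.

```csharp
using System.Collections.Generic;

// Hypothetical helper: matching tokens divided by all unique tokens.
public static class TokenSetSketch
{
    public static double Jaccard(IEnumerable<string> left, IEnumerable<string> right)
    {
        var leftSet = new HashSet<string>(left);
        var rightSet = new HashSet<string>(right);

        var union = new HashSet<string>(leftSet);
        union.UnionWith(rightSet);

        // Two empty names have nothing in common to compare.
        if (union.Count == 0)
        {
            return 0.0;
        }

        leftSet.IntersectWith(rightSet); // leftSet now holds the shared tokens
        return (double)leftSet.Count / union.Count;
    }
}
```

Because sets ignore order, the token lists of name_company and company_name give a ratio of 2 / 2 = 1.0, while first_name and last_name share one of three unique tokens and give 1 / 3.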

Token Containment Matching

Sometimes one field name is more specific than the other.

Example:

street <-> street_address

The names are not identical, and the token sets are not equal. But the shorter name is fully contained in the longer one. The mapper treats this as a confident match and assigns a score of 0.86.

This is especially useful for templates where fields include suffixes or additional context:

street1
street_address
billing_street
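
A containment check over tokens might look like this. It is a sketch under assumptions: the text does not spell out how numeric suffixes such as the one in street1 are handled, so this version strips trailing digits from each token before comparing. The names ContainmentSketch and StripDigits are hypothetical; only the score of 0.86 comes from the text.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical containment scorer operating on already tokenized names.
public static class ContainmentSketch
{
    public const double ContainmentScore = 0.86; // value quoted in the article

    public static double Score(IEnumerable<string> left, IEnumerable<string> right)
    {
        HashSet<string> a = Prepare(left);
        HashSet<string> b = Prepare(right);

        if (a.Count == 0 || b.Count == 0)
        {
            return 0.0;
        }

        // Every token of the smaller set must also appear in the larger one.
        // Equal sets pass as well; other scorers are expected to rank those higher.
        HashSet<string> smaller = a.Count <= b.Count ? a : b;
        HashSet<string> larger = a.Count <= b.Count ? b : a;

        return smaller.IsSubsetOf(larger) ? ContainmentScore : 0.0;
    }

    private static HashSet<string> Prepare(IEnumerable<string> tokens)
    {
        return new HashSet<string>(
            tokens.Select(StripDigits).Where(token => token.Length > 0));
    }

    // Assumption: "street1" should compare like "street".
    private static string StripDigits(string token)
    {
        return token.TrimEnd("0123456789".ToCharArray());
    }
}
```

Under these assumptions, street scores 0.86 against the tokens of street_address, street1, and billing_street, while first_name against last_name scores 0 because neither token set contains the other.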

Levenshtein Similarity

The mapper also uses Levenshtein distance to handle typo-like differences.

Example:

compnyname <-> companyname

Levenshtein distance counts the minimum number of single-character edits (insertions, deletions, or substitutions) needed to transform one string into another. The mapper converts that distance into a similarity score.

This allows close misspellings to be considered without requiring a large rule set.
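
The text does not state how the distance is converted into a score. One common conversion, used here as an assumption, is 1 - distance / length of the longer string. The class name LevenshteinSketch is hypothetical; the distance itself is the standard dynamic programming algorithm with two rolling rows.

```csharp
using System;

// Hypothetical helper: classic Levenshtein distance plus a normalized similarity.
public static class LevenshteinSketch
{
    public static int Distance(string source, string target)
    {
        var previous = new int[target.Length + 1];
        var current = new int[target.Length + 1];

        // Turning an empty prefix into the first j characters takes j insertions.
        for (int j = 0; j <= target.Length; j++)
        {
            previous[j] = j;
        }

        for (int i = 1; i <= source.Length; i++)
        {
            current[0] = i; // i deletions to reach an empty target prefix

            for (int j = 1; j <= target.Length; j++)
            {
                int substitution = source[i - 1] == target[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1, previous[j] + 1),
                    previous[j - 1] + substitution);
            }

            // Swap the rows so that "previous" holds the row just computed.
            (previous, current) = (current, previous);
        }

        return previous[target.Length];
    }

    // Assumed conversion: 1.0 for identical strings, 0.0 for nothing in common.
    public static double Similarity(string source, string target)
    {
        int longest = Math.Max(source.Length, target.Length);
        return longest == 0 ? 1.0 : 1.0 - (double)Distance(source, target) / longest;
    }
}
```

For compnyname and companyname, the distance is 1 (one missing character) and the longer string has 11 characters, so this conversion gives 1 - 1/11, or roughly 0.91.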

Choosing the Best Match

For each form field, the mapper calculates all candidate scores and chooses the JSON field with the highest score.

The final score is the best of:

normalized exact match
token set match
token containment match
Levenshtein similarity

A match is accepted only if it is above the configured threshold.

Default threshold:

0.72

This keeps the mapper from renaming fields that are only weakly related.
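
The selection step can be sketched independently of the individual scorers. In the following illustration, the scoring function is passed in as a delegate and is assumed to already return the best of the individual scores. The names BestMatchSketch and FindBest are hypothetical, and treating a score exactly equal to the threshold as accepted is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical selection helper; not the sample's FormFieldMapper.
public static class BestMatchSketch
{
    public static (string? Name, double Score) FindBest(
        string formField,
        IEnumerable<string> jsonFields,
        Func<string, string, double> score,
        double minimumScore = 0.72)
    {
        string? bestName = null;
        double bestScore = 0.0;

        foreach (string candidate in jsonFields)
        {
            double candidateScore = score(formField, candidate);
            if (candidateScore > bestScore)
            {
                bestName = candidate;
                bestScore = candidateScore;
            }
        }

        // Below the threshold, report the score but no name, so nothing is renamed.
        return bestScore >= minimumScore
            ? (bestName, bestScore)
            : (null, bestScore);
    }
}
```

Returning the score even for rejected matches makes it easy to log near misses and tune the threshold for a given set of templates.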

Handling Ambiguous Matches

The mapper also protects against ambiguous matches.

If the best match and second-best match are too close, the field is skipped. This is important because some JSON schemas contain similar field names:

company_name
contact_name
first_name
last_name

If a template field simply says name, mapping it automatically may be unsafe. Skipping ambiguous fields is usually better than silently choosing the wrong one.
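
The ambiguity check can be expressed as a minimum gap between the two best candidates. The text does not say how close is "too close", so the margin of 0.03 below is purely an illustrative value, and the names AmbiguitySketch and Decide are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical decision helper that skips weak and ambiguous matches.
public static class AmbiguitySketch
{
    public static string? Decide(
        IReadOnlyDictionary<string, double> scoresByJsonField,
        double minimumScore = 0.72,
        double ambiguityMargin = 0.03) // illustrative value, not from the sample
    {
        var ranked = scoresByJsonField
            .OrderByDescending(pair => pair.Value)
            .ToList();

        // No candidates, or the best one is too weak: skip the field.
        if (ranked.Count == 0 || ranked[0].Value < minimumScore)
        {
            return null;
        }

        // The runner-up is nearly as good: skip rather than guess.
        if (ranked.Count > 1 && ranked[0].Value - ranked[1].Value < ambiguityMargin)
        {
            return null;
        }

        return ranked[0].Key;
    }
}
```

If a template field named name scores equally against company_name and contact_name, this sketch returns null and the field keeps its original name, which leaves the decision to a person.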

Mapping Duplicate Fields

Templates often repeat the same data in more than one place.

For example, a customer address may appear on the first page and again in a footer, summary, or appendix. The same JSON field must be allowed to map to multiple form fields.

TxFormFieldMapper does not consume JSON fields after matching them.

This means mappings like this are allowed:

street_address -> street
street1        -> street

Both form fields can be renamed to the same JSON field name.

Updating the TX Text Control Document

The TX Text Control integration is intentionally straightforward.

The app loads the template:

using var textControl = new ServerTextControl();
textControl.Create();
textControl.Load(templatePath, StreamType.InternalUnicodeFormat);

Then it passes the form field collection to the mapper:

var mapper = new FormFieldMapper(jsonFieldNames, minimumScore);
var results = mapper.RenameFormFields(textControl.FormFields);

When a match is accepted, the form field is renamed:

field.Name = decision.JsonName;

Finally, the mapped document is saved:

textControl.Save(outputPath, StreamType.InternalUnicodeFormat);

The result is a new TX document whose form fields are aligned with the JSON data source.

Example Output

A sample run might produce output like this:

renamed: 'company_name' -> 'companyname' (0.99)
renamed: 'firstName' -> 'firstname' (0.99)
renamed: 'street_address' -> 'street' (0.86)
renamed: 'street1' -> 'street' (0.86)

Saved mapped template: data\mapped-forms.tx

This output shows which fields were renamed and the confidence score for each match.

Project Structure

The sample separates the matching logic into focused classes:

Class                    Responsibility
Program.cs               Application flow
AppOptions.cs            Command-line and default path handling
JsonFieldExtractor.cs    JSON field extraction
FormFieldMapper.cs       TX form field renaming
FieldNameTokenizer.cs    Tokenization and normalization
FieldNameScorer.cs       Similarity scoring
FieldNameCandidate.cs    Prepared field-name representation
MapResult.cs             Mapping result output
MatchDecision.cs         Match decision state

This makes it easy to adjust the matching behavior without touching the TX Text Control loading and saving code.

Conclusion

TxFormFieldMapper shows how a small amount of fuzzy matching can make document template automation much more forgiving.

Instead of requiring every form field to be named perfectly, the mapper compares field names in a way that matches how people actually name things: with different separators, casing, word order, and occasional extra context.

By combining JSON field extraction, token-based comparison, Levenshtein similarity, thresholding, ambiguity checks, and TX Text Control's FormFieldCollection, the project provides a practical foundation for automatically preparing templates for JSON-based document workflows.

GitHub

Download and Fork This Sample on GitHub

We proudly host our sample code on github.com/TextControl.

Please fork and contribute.

Requirements for this sample

  • TX Text Control .NET Server for ASP.NET 34.0
  • Visual Studio 2022 or later
