API Data Formats: JSON, XML, and More

API data formats define how data is structured and exchanged between systems.

APIs exchange data in structured formats that both the sending system and the receiving system can read and parse. Choosing the right format for your API affects developer experience, performance, and long-term compatibility with other systems.

What Is an API Data Format

When an API sends or receives data, it needs a consistent structure that both sides understand. A data format defines the syntax and meaning of that data, including how fields are named, how data types are expressed, and how the overall document is structured. Without a shared format, the client and server cannot interpret each other's messages.

The format being used is declared in the HTTP request or response using the Content-Type header. For example, a server returning JSON will include Content-Type: application/json. A server returning XML will use Content-Type: application/xml. This tells the receiving side how to parse the body of the message.
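As a rough illustration of how a receiver might use this header, the sketch below picks a parser based on the declared media type. The function name parse_body and the sample payloads are assumptions made for this example, and only the two media types mentioned above are handled.

```python
import json
import xml.etree.ElementTree as ET


def parse_body(content_type: str, body: bytes):
    """Parse a message body according to its declared Content-Type."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        return json.loads(body)
    if media_type == "application/xml":
        return ET.fromstring(body)
    raise ValueError(f"Unsupported Content-Type: {content_type!r}")


# Hypothetical payloads, used only to exercise the function.
user = parse_body("application/json; charset=utf-8", b'{"id": 42, "name": "Alice"}')
doc = parse_body("application/xml", b"<user><id>42</id></user>")
```

A production service would usually rely on its web framework for this dispatch rather than parsing the header by hand.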

JSON: The Modern Standard

JSON (JavaScript Object Notation) is the de facto standard for REST APIs today. It is human-readable, lightweight, and supported by built-in or standard libraries in every major programming language. JSON grew popular alongside the rise of web applications because JavaScript, which runs in every browser, can parse it with the built-in JSON.parse function and no extra libraries.

A typical JSON document is an object: a set of key-value pairs wrapped in curly braces, with each key written as a double-quoted string. Values can be strings, numbers, booleans, null, arrays, or nested objects. This small set of types covers the majority of what APIs need to express.

{
  "id": 42,
  "name": "Alice",
  "email": "alice@example.com",
  "active": true
}

JSON is compact compared to XML, easy to read during debugging, and well supported by tooling including Postman, Swagger, and most API gateways. For new public-facing REST APIs, JSON is almost always the right default.
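To show how these JSON types surface in application code, here is a minimal sketch using Python's standard json module on the user document shown earlier. The variable names are chosen only for this example.

```python
import json

raw = '{"id": 42, "name": "Alice", "email": "alice@example.com", "active": true}'

# Parse the text into native Python values.
user = json.loads(raw)

# Each JSON type maps to a Python type: number -> int, string -> str, true -> bool.
types = {key: type(value).__name__ for key, value in user.items()}

# Serialise back to text without optional whitespace to keep the payload small.
compact = json.dumps(user, separators=(",", ":"))
```

Other languages follow the same pattern: one call to parse text into native values and one call to serialise them back.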

XML: The Legacy Format

XML (Extensible Markup Language) was the dominant API data format before JSON became widespread. It uses opening and closing tags to wrap values, which makes documents verbose but also highly expressive. XML supports attributes on elements, namespaces for avoiding naming conflicts, and strict validation through XML Schema Definition (XSD) files.

<user>
  <id>42</id>
  <name>Alice</name>
  <email>alice@example.com</email>
</user>

XML is still common in enterprise systems, SOAP APIs, RSS and Atom feeds, and financial or banking integrations where strict schema validation is a requirement. If you are connecting to a legacy system or a government data service, there is a good chance it expects XML. The format is not going away, but for new projects it is rarely the first choice.
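The sketch below parses the user document shown earlier with Python's standard xml.etree.ElementTree module. It also illustrates one practical difference from JSON: without a schema, every element value arrives as text, so the numeric id has to be converted explicitly. The variable names are assumptions for this example.

```python
import xml.etree.ElementTree as ET

raw = """<user>
  <id>42</id>
  <name>Alice</name>
  <email>alice@example.com</email>
</user>"""

root = ET.fromstring(raw)

# Collect child elements into a dictionary; every value is a string at this point.
user = {child.tag: child.text for child in root}

# XML itself carries no number type, so the caller converts the id.
user_id = int(user["id"])
```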

YAML: Readable but Rarely Used in APIs

YAML (YAML Ain't Markup Language) is designed for human readability. It uses indentation rather than brackets or tags to define structure, which makes it very clean to look at. YAML also offers features that JSON lacks, including comments, multi-line strings, and, in many parsers, native date values.

However, YAML is almost never used as a response format in APIs. Its primary role is in configuration files, CI/CD pipelines, and tools like Kubernetes or Ansible. The OpenAPI specification, which is used to document REST APIs, is commonly written in YAML. So while YAML is not a format you will typically send or receive over an API, you will encounter it when reading API documentation files.

Protocol Buffers: Built for Performance

Protocol Buffers, often called Protobuf, is a binary serialization format developed by Google. Unlike JSON or XML, Protobuf is not human-readable. Data is encoded in a compact binary form that is significantly smaller and faster to parse than text-based formats.

To use Protobuf, you define your data structure in a .proto schema file. The schema is compiled into code for each language you need, and that generated code handles serialization and deserialization automatically. Because the generated code is typed, statically typed languages catch many field and type mistakes at compile time rather than at runtime.
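To make the size difference concrete, the following hand-written sketch encodes two fields the way the Protobuf wire format does and compares the result with the same data as compact JSON. It is illustrative only: real projects use the code generated from the .proto file, and the field numbers (1 for id, 2 for name) and helper function names here are assumptions.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int_field(field_number: int, value: int) -> bytes:
    # The field key is (field_number << 3) | wire_type; wire type 0 is a varint.
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    # Wire type 2 is length-delimited: key, byte length, then the UTF-8 bytes.
    data = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data


# Assumed schema for this sketch: id is field 1, name is field 2.
binary_user = encode_int_field(1, 42) + encode_string_field(2, "Alice")
json_user = json.dumps({"id": 42, "name": "Alice"}, separators=(",", ":")).encode("utf-8")

print(len(binary_user), len(json_user))  # 9 bytes of binary versus 24 bytes of JSON
```

The binary form is smaller mainly because field names are replaced by small numbers that only the schema can translate back, which is also why the payload cannot be read without tooling.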

Protobuf is the format used by gRPC, a high-performance remote procedure call framework also developed by Google. It is widely used in internal microservice communication where throughput and latency matter more than human readability. For public APIs where developers need to inspect responses easily, JSON remains the more practical choice.

Format Comparison

Feature        | JSON                                                   | XML                          | YAML                   | Protocol Buffers
---------------|--------------------------------------------------------|------------------------------|------------------------|--------------------------------------
Readability    | Good                                                   | Moderate (verbose)           | Excellent              | Binary (not readable)
Payload Size   | Compact                                                | Large (tag overhead)         | Compact                | Very small
Parse Speed    | Fast                                                   | Slower                       | Slow                   | Extremely fast
Type Support   | Basic (string, number, boolean, null, array, object)   | All text (types via schema)  | Rich types             | Strongly typed
Schema Support | JSON Schema (optional)                                 | XSD (strict)                 | Optional               | Required (.proto file)
Best For       | REST APIs, web apps                                    | SOAP, enterprise, XML feeds  | Config files, API docs | High-performance internal APIs (gRPC)

Content Negotiation

Some APIs support more than one format and let the client choose which one to use. This is handled through a process called content negotiation. The client includes an Accept header in its request specifying the preferred format. The server reads that header and responds in the matching format if it supports it.

  • Accept: application/json requests a JSON response
  • Accept: application/xml requests an XML response
  • If the server does not support the requested format, it typically returns a 406 Not Acceptable status code
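
The negotiation steps above can be sketched on the server side as follows. This is a simplified illustration, not a complete implementation: it ignores q-value weighting and simply takes the first supported type in the order listed, and the respond function, the RENDERERS table, and the sample USER record are assumptions made for the example.

```python
import json
import xml.etree.ElementTree as ET

USER = {"id": 42, "name": "Alice"}  # sample record for the example


def to_json(data: dict) -> str:
    return json.dumps(data)


def to_xml(data: dict) -> str:
    root = ET.Element("user")
    for key, value in data.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


RENDERERS = {"application/json": to_json, "application/xml": to_xml}


def respond(accept_header: str):
    """Return (status, content_type, body) for the given Accept header."""
    for item in accept_header.split(","):
        # Strip parameters such as ";q=0.9"; this sketch does not rank by q-value.
        media_type = item.split(";")[0].strip().lower()
        if media_type in ("", "*/*", "application/*"):
            media_type = "application/json"  # assumed default format
        renderer = RENDERERS.get(media_type)
        if renderer is not None:
            return 200, media_type, renderer(USER)
    return 406, "text/plain", "Not Acceptable"
```

Calling respond("application/xml") yields the XML rendering, while respond("text/csv") falls through to the 406 response.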

Supporting multiple formats adds complexity to your API. Unless you have a clear need to serve both JSON and XML consumers, it is simpler to pick one format and document it clearly.

Choosing the Right Format

The format you choose should match the context of your API and the expectations of the developers who will use it. Here are the most common scenarios and the recommended choice for each.

  • Building a public REST API: Use JSON. It is what developers expect, and virtually every language and HTTP client has built-in or standard library support for it.
  • Integrating with a legacy enterprise system: Check the system's documentation. XML is likely required, especially for SOAP-based services.
  • Internal microservice communication with high throughput: Consider Protocol Buffers with gRPC for reduced payload size and faster parsing.
  • Writing API documentation or configuration files: YAML is the standard choice for OpenAPI specs and infrastructure configuration.
  • Working with RSS or Atom feeds: Use XML. Both feed standards are XML-based.

Setting Content-Type Correctly

Regardless of the format you use, set the Content-Type header on every request and response that carries a body. Many parsers and API gateways reject messages that do not declare their format, and some clients fall back to guessing the format from the body, which can lead to unpredictable errors.

  • For JSON: Content-Type: application/json
  • For XML: Content-Type: application/xml
  • For form submissions: Content-Type: application/x-www-form-urlencoded
  • For file uploads: Content-Type: multipart/form-data
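
As a minimal sketch of setting these headers on the client side, the example below builds two requests with Python's standard urllib module without sending them. The URL and payload are hypothetical placeholders.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://api.example.com/users"  # hypothetical endpoint

# JSON body: serialise the data and declare application/json.
json_request = urllib.request.Request(
    API_URL,
    data=json.dumps({"name": "Alice"}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Form body: URL-encode the fields and declare the matching type.
form_request = urllib.request.Request(
    API_URL,
    data=urllib.parse.urlencode({"name": "Alice"}).encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method="POST",
)

# Note: urllib stores header names in capitalised form, so the lookup key is "Content-type".
declared = json_request.get_header("Content-type")
```

Passing either object to urllib.request.urlopen would send it. The key point is that the body encoding and the declared Content-Type always change together.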

Frequently Asked Questions

  1. What Content-Type should I set for JSON APIs?
    Use Content-Type: application/json on every request and response that carries a JSON body. Always set it explicitly. Some clients and parsers reject messages that do not declare their content type, even if the body looks like valid JSON.
  2. Can a single API support multiple formats?
    Yes. The client sends an Accept: application/json or Accept: application/xml header, and the server responds in the preferred format. This is called content negotiation. Supporting multiple formats is possible but adds development and maintenance overhead, so only do it when you have a genuine need.
  3. Is JSON always better than XML?
    Not always. XML supports attributes, mixed content, comments, and well-defined namespaces. For documents with complex metadata, such as an XHTML page, a Word document, or a financial transaction record that requires strict schema validation, XML is often more expressive and appropriate than JSON.
  4. What is the difference between JSON and Protocol Buffers?
    JSON is a text-based format that is easy to read and debug. Protocol Buffers encode data in binary, which is faster to parse and produces smaller payloads, but you cannot read it directly without tooling. JSON is better for public APIs; Protocol Buffers are better for internal services where performance is a priority.
  5. Do I need a schema for JSON?
    A schema is optional for JSON. You can use JSON Schema to define and validate the structure of your data, which helps catch errors early and generates useful documentation. For small or internal APIs, many teams skip formal schemas. For public APIs with many consumers, a schema provides clear contracts and improves reliability.

Conclusion

JSON dominates modern REST APIs for good reason. It is readable, compact, and universally supported across programming languages, frameworks, and tooling. XML remains relevant in enterprise and legacy systems where strict validation and namespacing are important. Protocol Buffers are the right choice for high-performance internal APIs built on gRPC. YAML serves its purpose in configuration files and API documentation rather than in live API responses. Matching the format to your specific use case is a straightforward but important API design decision that affects everyone who builds on top of your service. To learn more, explore REST APIs, HTTP and HTTPS, and HTTP content types.