Deep API Troubleshooting Guide

Goal

API troubleshooting is about finding where the failure is happening:

Client/App → Network → API Gateway → Authentication → Backend Service → Database/External System

A good implementation engineer isolates the failing layer before escalating.


1. Start With 4 Basic Questions

Question | Why It Matters
What is expected to happen? | Defines success
What actually happens? | Defines the failure
Is it reproducible? | Separates a one-time issue from a systemic issue
Who is affected? | Helps identify scope

Example:

Expected: CRM ticket is created after chat ends.
Actual: Chat ends, but no CRM ticket appears.
Scope: One customer environment, all agents.

2. Identify the API Direction

Direction | Meaning | Example
Inbound API | Customer system calls platform API | CRM calls Glia API
Outbound API | Platform calls customer system | Glia calls CRM API
Webhook | Event notification sent automatically | Chat ended → CRM ticket
Callback | Response sent after async process | Payment status update

This matters because it tells you which side to check first.


3. Check the HTTP Method

Method | Troubleshooting Focus
GET | Is the resource ID correct?
POST | Is the payload valid?
PUT | Are all required fields included?
PATCH | Are updated fields allowed?
DELETE | Does the user/token have delete permission?

Example issue:

GET /api/create-ticket

But the API expects:

POST /api/create-ticket

Likely result:

405 Method Not Allowed

4. Check the Endpoint

Validate:

Base URL
API version
Path
Resource ID
Query parameters
Environment

Example:

https://api.vendor.com/v1/customers/12345

Common mistakes:

Problem | Example
Wrong environment | Using sandbox instead of production
Wrong API version | /v1/ instead of /v2/
Typo in endpoint | /costumers/ instead of /customers/
Missing resource ID | /customers/ instead of /customers/12345
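
The checks above can be sketched as a small comparison between the documented URL and the URL actually being called. This is a minimal illustration using only the Python standard library. The helper names describe_endpoint and diff_endpoints are hypothetical, and the sketch assumes the API version appears as a path segment such as v1 or v2.

```python
from urllib.parse import urlsplit, parse_qs


def describe_endpoint(url: str) -> dict:
    """Split a URL into the parts worth validating one by one."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # Assumption: the version is a path segment like "v1" or "v2".
    version = next(
        (s for s in segments if s[:1] == "v" and s[1:].isdigit()), None
    )
    return {
        "base_url": f"{parts.scheme}://{parts.netloc}",
        "api_version": version,
        "path": parts.path,
        "last_segment": segments[-1] if segments else None,
        "query": parse_qs(parts.query),
    }


def diff_endpoints(expected: str, actual: str) -> list:
    """Return the names of the URL parts that differ."""
    a, b = describe_endpoint(expected), describe_endpoint(actual)
    return [key for key in a if a[key] != b[key]]


print(diff_endpoints(
    "https://api.vendor.com/v1/customers/12345",
    "https://api.vendor.com/v2/costumers/12345",
))
```

Comparing part by part points at the mismatch (here the version and the path) instead of leaving you to spot a typo by eye.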

5. Check Headers

Important headers:

Header | Purpose
Authorization | Sends bearer token/API key
Content-Type | Format of request body
Accept | Format expected in response
x-api-key | API key authentication
X-Correlation-ID | Request tracking (exact header name varies by vendor)

Example:

Authorization: Bearer eyJhbGc...
Content-Type: application/json
Accept: application/json

Common issue:

Content-Type missing

Without it, the API may not parse the JSON body and may reject the request, often with 415 or 400.
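
A quick way to catch a missing header before sending is to compare the request headers against a required list. The sketch below is illustrative only. The function name missing_headers and the default required list are assumptions, so adjust them to what the API documentation actually demands.

```python
def missing_headers(
    headers: dict,
    required=("Authorization", "Content-Type", "Accept"),
) -> list:
    """Return required header names absent from the request.

    Header names are case-insensitive in HTTP, so compare in lowercase.
    """
    present = {name.lower() for name in headers}
    return [name for name in required if name.lower() not in present]


request_headers = {
    "authorization": "Bearer <token>",
    "Accept": "application/json",
}
print(missing_headers(request_headers))
```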


6. Check Authentication

Many API failures happen here, so check it early.

Error | Meaning | Common Cause
401 | Not authenticated | Missing/expired/invalid token
403 | Authenticated but blocked | Missing permission/scope/role

401 Checklist

Check:

Is Authorization header present?
Is token expired?
Is token copied correctly?
Is token for the right environment?
Was token revoked?
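
The expiry check can be done locally if the bearer token is a JWT. That is an assumption, since opaque tokens cannot be decoded this way. The sketch below reads the exp claim without verifying the signature, so it is suitable for diagnosis only, never for authorization decisions. The function names are hypothetical.

```python
import base64
import json
import time


def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying the signature."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def is_expired(token: str, now=None) -> bool:
    """True when the exp claim (seconds since the epoch) is in the past."""
    now = time.time() if now is None else now
    exp = jwt_claims(token).get("exp")
    return exp is not None and exp <= now
```

The decoded claims often also show the issuer or audience, which helps answer whether the token belongs to the right environment.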

403 Checklist

Check:

Does token have correct scope?
Does user/app have permission?
Is IP allowlisting required?
Is this endpoint restricted?

7. Check OAuth Scope

Scopes define what the token can do.

Example:

read:customers
write:tickets
admin

If token only has:

read:customers

but you call:

POST /api/tickets

You may get:

403 Forbidden
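
The scope check above can be sketched as a lookup. This assumes the token's scopes arrive as a space-delimited string, as in the OAuth 2.0 scope parameter. The REQUIRED_SCOPES mapping is hypothetical, and the real mapping comes from the API documentation.

```python
# Hypothetical mapping from (method, path) to the scope it requires.
REQUIRED_SCOPES = {
    ("GET", "/api/customers"): "read:customers",
    ("POST", "/api/tickets"): "write:tickets",
}


def missing_scope(granted: str, method: str, path: str):
    """Return the scope the call needs but the token lacks, else None."""
    needed = REQUIRED_SCOPES.get((method.upper(), path))
    if needed is None or needed in granted.split():
        return None
    return needed


print(missing_scope("read:customers", "POST", "/api/tickets"))
```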

8. Check the JSON Payload

For POST/PUT/PATCH, inspect payload carefully.

Example:

{
  "customerId": "12345",
  "channel": "chat",
  "authenticated": true
}

Validate:

Item | Example Problem
Syntax | Missing comma
Required fields | Missing customerId
Data type | "true" instead of true
Field name | customerID instead of customerId
Nesting | Object expected, string sent
Array format | Single value sent instead of list
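
Several of these checks can be run locally before the request is sent. The sketch below is a minimal illustration using the example payload above. The REQUIRED_FIELDS schema is an assumption made for the example, not a documented contract.

```python
import json

# Hypothetical schema: field name -> expected Python type after parsing.
REQUIRED_FIELDS = {"customerId": str, "channel": str, "authenticated": bool}


def payload_problems(raw: str) -> list:
    """Return a list of human-readable problems found in a JSON payload."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as err:
        return [f"syntax: {err.msg} at line {err.lineno}, column {err.colno}"]
    if not isinstance(data, dict):
        return ["nesting: expected a JSON object at the top level"]
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            problems.append(f"missing required field: {name}")
        elif not isinstance(data[name], expected):
            problems.append(
                f"wrong type for {name}: expected {expected.__name__}, "
                f"got {type(data[name]).__name__}"
            )
    return problems


bad = '{"customerID": "12345", "channel": "chat", "authenticated": "true"}'
for problem in payload_problems(bad):
    print(problem)
```

A misspelled field name shows up as a missing required field, which is usually how the API reports it as well.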

9. Understand Status Codes by Category

Code Range | Meaning
2xx | Success
3xx | Redirect
4xx | Client/request issue
5xx | Server/backend issue

Important Codes

Code | Meaning | First Place To Look
200 | Success | Validate response body
201 | Created | Confirm resource ID returned
204 | No Content | Success, but no body
400 | Bad Request | Payload/parameters
401 | Unauthorized | Authentication
403 | Forbidden | Permissions/scopes
404 | Not Found | Endpoint/resource
405 | Method Not Allowed | HTTP method
408 | Request Timeout | Client/network too slow sending the request
409 | Conflict | Duplicate/existing resource
415 | Unsupported Media Type | Content-Type
422 | Validation error (Unprocessable Content) | Field validation
429 | Rate limit | Too many requests
500 | Server error | Backend logs
502 | Bad gateway | Proxy/upstream
503 | Unavailable | Outage/maintenance
504 | Gateway timeout | Backend too slow
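
The idea of mapping a status code to a first place to look can be sketched as a lookup with a fallback to the code range. The function name triage and the wording of the hints are illustrative, not a standard.

```python
FIRST_PLACE_TO_LOOK = {
    400: "payload/parameters",
    401: "authentication",
    403: "permissions/scopes",
    404: "endpoint/resource",
    405: "HTTP method",
    409: "duplicate/existing resource",
    415: "Content-Type",
    422: "field validation",
    429: "request rate",
    500: "backend logs",
    502: "proxy/upstream",
    503: "outage/maintenance",
    504: "backend latency",
}


def triage(status: int) -> str:
    """Suggest where to start looking for a given HTTP status code."""
    if status in FIRST_PLACE_TO_LOOK:
        return FIRST_PLACE_TO_LOOK[status]
    # Unknown specific code: fall back to the meaning of its range.
    if 200 <= status < 300:
        return "success: validate the response body"
    if 300 <= status < 400:
        return "redirect: check the Location header and base URL"
    if 400 <= status < 500:
        return "client/request issue"
    if 500 <= status < 600:
        return "server/backend issue"
    return "not a standard HTTP status code"


print(triage(403))
```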

10. Troubleshooting by Error Code

400 Bad Request

Likely:

Invalid payload, missing fields, wrong query parameter

Check:

JSON syntax
Required fields
Field names
Data types
API documentation

401 Unauthorized

Likely:

Authentication failed

Check:

Token present
Token expired
Correct auth method
Correct environment

403 Forbidden

Likely:

Permission issue

Check:

User role
OAuth scopes
API permissions
IP allowlist

404 Not Found

Likely:

Wrong endpoint or resource does not exist

Check:

URL path
API version
Resource ID
Environment

409 Conflict

Likely:

Duplicate or conflicting resource

Example:

Trying to create a user that already exists

415 Unsupported Media Type

Likely:

Wrong or missing Content-Type

Check:

Content-Type: application/json

422 Validation Error

Likely:

Payload is valid JSON but fails field validation or business rules

Example:

Phone number format invalid
Required field has invalid value

429 Too Many Requests

Likely:

Rate limit exceeded

Check:

Retry-After header
API rate limits
Looping automation
Retry logic

500 Internal Server Error

Likely:

Backend application error

Action:

Collect request ID, timestamp, payload sample, and escalate

502 / 503 / 504

Likely:

Gateway, service availability, or timeout issue

Check:

Load balancer
API gateway
Backend health
Maintenance window
Network latency

11. Network Layer Checks

APIs depend on network reachability.

Check:

DNS resolution
Firewall
Proxy
Port 443
TLS certificate
IP allowlisting
VPN/private connectivity

Common symptoms:

Symptom | Possible Cause
Timeout | Firewall, proxy, backend delay
TLS error | Certificate or TLS mismatch
DNS failure | Wrong hostname
Connection refused | Service down or port blocked
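
The symptoms above map closely to the exceptions Python raises when opening a TCP connection, so a short probe can tell them apart. This is a minimal sketch using the standard socket module. The function name check_tcp and the returned labels are hypothetical.

```python
import socket


def check_tcp(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Try a plain TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "reachable"
    except socket.gaierror:
        return "dns failure"  # hostname could not be resolved
    except socket.timeout:
        return "timeout"  # often a firewall silently dropping packets
    except ConnectionRefusedError:
        return "connection refused"  # nothing listening, or port rejected
    except OSError as err:
        return f"other network error: {err}"
```

Run it from the same machine or network segment as the failing client, for example check_tcp("api.vendor.com"), because reachability from your laptop says little about the customer's network.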

12. TLS / Certificate Checks

For HTTPS APIs, validate:

Certificate not expired
Hostname matches certificate
Certificate chain trusted
TLS 1.2 or 1.3 supported

Common failures:

certificate expired
self-signed certificate
hostname mismatch
TLS handshake failed
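
These checks can be sketched with Python's standard ssl module. With default settings the handshake itself verifies the chain and hostname, so an untrusted, expired, or mismatched certificate raises ssl.SSLCertVerificationError. The function names are hypothetical, and fetch_certificate needs network access, so it is defined here but not called.

```python
import socket
import ssl
import time


def fetch_certificate(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Handshake with default verification and return the peer certificate."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def days_until_expiry(cert: dict, now=None) -> float:
    """Days left before the certificate's notAfter date."""
    now = time.time() if now is None else now
    return (ssl.cert_time_to_seconds(cert["notAfter"]) - now) / 86400


sample = {"notAfter": "Jan  1 00:00:00 2030 GMT"}
print(round(days_until_expiry(sample)))
```

A certificate that still verifies but has only a few days left is worth flagging before it turns into an outage.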

13. Rate Limiting

Rate limiting protects APIs from overload.

Common response:

429 Too Many Requests

Check headers:

Retry-After: 60

Troubleshooting:

Reduce request frequency
Add backoff/retry logic
Check if automation is looping
Review API quota
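
Backoff logic can be sketched as a function that decides how long to wait before the next retry. It honors Retry-After when the value is a number of seconds and otherwise falls back to exponential backoff with jitter. The function name, base delay, and cap are assumptions chosen for illustration.

```python
import random


def retry_delay(attempt: int, retry_after=None,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Retry-After can also be an HTTP date; use backoff instead
    ceiling = min(cap, base * 2 ** attempt)
    # Full jitter: a random wait up to the ceiling spreads out retries.
    return random.uniform(0, ceiling)


print(retry_delay(0, retry_after="60"))
```

Jitter matters when many clients hit the limit together, because fixed delays make them all retry at the same moment.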

14. Timeouts

Timeouts can occur at multiple layers:

Client timeout
API gateway timeout
Load balancer timeout
Backend service timeout
Database timeout

Ask:

How long before failure?
Does it fail always or intermittently?
Is payload large?
Is backend dependency slow?

15. Correlation IDs

A correlation ID tracks one request across systems.

Example:

X-Correlation-ID: abc-123

Use it to trace:

API Gateway → Auth Service → Backend Service → Database

For escalation, always include:

correlation ID
timestamp
endpoint
HTTP method
response code
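
The escalation fields above can be captured at request time so nothing has to be reconstructed later. This sketch assumes the X-Correlation-ID header name from the example. Some platforms use a different name or generate the ID server-side. The function names are hypothetical.

```python
import uuid
from datetime import datetime, timezone


def with_correlation_id(headers=None) -> dict:
    """Return a copy of the headers with a correlation ID if none is set."""
    result = dict(headers or {})
    result.setdefault("X-Correlation-ID", str(uuid.uuid4()))
    return result


def escalation_record(headers: dict, method: str,
                      endpoint: str, status: int) -> dict:
    """Bundle the evidence to include when escalating."""
    return {
        "correlation_id": headers["X-Correlation-ID"],
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "endpoint": endpoint,
        "http_method": method,
        "response_code": status,
    }


headers = with_correlation_id({"Accept": "application/json"})
print(escalation_record(headers, "POST", "/api/tickets", 400))
```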

16. Logs To Review

Log Type | What It Shows
API Gateway logs | Request entry, routing, response code
Auth logs | OAuth/token failures
Application logs | Backend errors
Webhook logs | Delivery attempts
Load balancer logs | Upstream health/timeouts
Firewall/proxy logs | Blocked connections
CRM logs | Customer/ticket integration failures

17. Good Troubleshooting Workflow

1. Confirm expected behavior
2. Reproduce issue
3. Identify endpoint + method
4. Check response code
5. Validate auth
6. Validate headers
7. Validate payload
8. Validate network/TLS
9. Check logs/correlation ID
10. Isolate failed layer
11. Escalate with evidence

18. Example Scenario

Issue

Chat ends but CRM ticket is not created.

Expected Flow

Chat completed → Webhook sent → CRM API creates ticket

Troubleshooting

Check:

Was chat_completed event generated?
Was webhook sent?
Did CRM receive webhook?
What HTTP response did CRM return?
Was token valid?
Was JSON payload accepted?
Was a required ticket field missing?

Possible findings:

Finding | Meaning
No event generated | Platform workflow issue
Webhook sent, no response | CRM endpoint/network issue
401 from CRM | Auth/token issue
400 from CRM | Payload/field mapping issue
500 from CRM | CRM backend issue

19. Another Scenario

Issue

Customer lookup fails during chat start.

Expected Flow

Customer starts chat → Platform calls CRM API → CRM returns profile

Check:

Customer ID passed correctly?
CRM endpoint correct?
Token valid?
Customer exists?
Field mapping correct?
CRM API response code?

Likely outcomes:

Response | Likely Issue
401 | Token expired
403 | Missing CRM permission
404 | Customer not found
400 | Bad customerId format
500 | CRM backend failure

20. What To Say In an Interview

API Troubleshooting Answer

“I start by confirming the expected workflow and reproducing the issue. Then I identify the API endpoint, HTTP method, response code, headers, authentication method, and payload. Based on the response code, I isolate whether the issue is authentication, permissions, request formatting, endpoint, networking, or backend related. I also check logs, timestamps, and correlation IDs before escalating with clear evidence.”


21. Implementation Engineer Mindset

Do not just say:

The API failed.

Say:

The POST request to the CRM ticket endpoint is returning 400 because the payload is missing the required caseType field.

That shows strong technical ownership.