Advanced Usage

Agent Configuration

If the Datadog Agent is on a separate host from your application, you can modify the default ddtrace.tracer object to utilize another hostname and port. Here is a small example showcasing this:

from ddtrace import tracer

tracer.configure(hostname=<YOUR_HOST>, port=<YOUR_PORT>, https=<True/False>)

By default, these will be set to localhost, 8126, and False respectively.

You can also use a Unix Domain Socket to connect to the agent:

from ddtrace import tracer

tracer.configure(uds_path="/path/to/socket")

Distributed Tracing

To trace requests across hosts, the spans on the secondary hosts must be linked together by setting trace_id, parent_id and sampling_priority.

  • On the server side, it means to read propagated attributes and set them to the active tracing context.
  • On the client side, it means to propagate the attributes, commonly as a header/metadata.

ddtrace already provides default propagators but you can also implement your own.

Web Frameworks

Some web framework integrations support distributed tracing out of the box.

Supported web frameworks:

Framework/Library Enabled
aiohttp True
Bottle True
Django True
Falcon True
Flask True
Pylons True
Pyramid True
Requests True
Tornado True

HTTP Client

For distributed tracing to work, necessary tracing information must be passed alongside a request as it flows through the system. When the request is handled on the other side, the metadata is retrieved and the trace can continue.

To propagate the tracing information, HTTP headers are used to transmit the required metadata to piece together the trace.

Custom

You can manually propagate your tracing context over your RPC protocol. Here is an example assuming that you have rpc.call function that call a method and propagate a rpc_metadata dictionary over the wire:

# Implement your own context propagator
class MyRPCPropagator(object):
    def inject(self, span_context, rpc_metadata):
        rpc_metadata.update({
            'trace_id': span_context.trace_id,
            'span_id': span_context.span_id,
            'sampling_priority': span_context.sampling_priority,
        })

    def extract(self, rpc_metadata):
        return Context(
            trace_id=rpc_metadata['trace_id'],
            span_id=rpc_metadata['span_id'],
            sampling_priority=rpc_metadata['sampling_priority'],
        )

# On the parent side
def parent_rpc_call():
    with tracer.trace("parent_span") as span:
        rpc_metadata = {}
        propagator = MyRPCPropagator()
        propagator.inject(span.context, rpc_metadata)
        method = "<my rpc method>"
        rpc.call(method, metadata)

# On the child side
def child_rpc_call(method, rpc_metadata):
    propagator = MyRPCPropagator()
    context = propagator.extract(rpc_metadata)
    tracer.context_provider.activate(context)

    with tracer.trace("child_span") as span:
        span.set_meta('my_rpc_method', method)

Sampling

Priority Sampling

To learn about what sampling is check out our documentation here.

By default priorities are set on a trace by a sampler. The sampler can set the priority to the following values:

  • AUTO_REJECT: the sampler automatically rejects the trace
  • AUTO_KEEP: the sampler automatically keeps the trace

Priority sampling is enabled by default. When enabled, the sampler will automatically assign a priority to your traces, depending on their service and volume. This ensures that your sampled distributed traces will be complete.

You can also set this priority manually to either drop an uninteresting trace or to keep an important one. To do this, set the context.sampling_priority to one of the following:

  • USER_REJECT: the user asked to reject the trace
  • USER_KEEP: the user asked to keep the trace

When not using distributed tracing, you may change the priority at any time, as long as the trace is not finished yet. But it has to be done before any context propagation (fork, RPC calls) to be effective in a distributed context. Changing the priority after context has been propagated causes different parts of a distributed trace to use different priorities. Some parts might be kept, some parts might be rejected, and this can cause the trace to be partially stored and remain incomplete.

If you change the priority, we recommend you do it as soon as possible, when the root span has just been created:

from ddtrace.ext.priority import USER_REJECT, USER_KEEP

context = tracer.context_provider.active()

# indicate to not keep the trace
context.sampling_priority = USER_REJECT

Client Sampling

Client sampling enables the sampling of traces before they are sent to the Agent. This can provide some performance benefit as the traces will be dropped in the client.

The RateSampler randomly samples a percentage of traces:

from ddtrace.sampler import RateSampler

# Sample rate is between 0 (nothing sampled) to 1 (everything sampled).
# Keep 20% of the traces.
sample_rate = 0.2
tracer.sampler = RateSampler(sample_rate)

App Analytics

Use App Analytics to filter application performance metrics and analyzed spans by user-defined tags. An APM event is generated every time a trace is generated.

Enabling analyzed spans for all web frameworks can be accomplished by setting the environment variable DD_TRACE_ANALYTICS_ENABLED=true:

For most libraries, analyzed spans can be enabled with the environment variable DD_TRACE_{INTEGRATION}_ANALYTICS_ENABLED=true:

Library Environment Variable
aiobotocore DD_TRACE_AIOBOTOCORE_ANALYTICS_ENABLED
aiopg DD_TRACE_AIOPG_ANALYTICS_ENABLED
Boto DD_TRACE_BOTO_ANALYTICS_ENABLED
Botocore DD_TRACE_BOTOCORE_ANALYTICS_ENABLED
Bottle DD_TRACE_BOTTLE_ANALYTICS_ENABLED
Cassandra DD_TRACE_CASSANDRA_ANALYTICS_ENABLED
Celery DD_TRACE_CELERY_ANALYTICS_ENABLED
Elasticsearch DD_TRACE_ELASTICSEARCH_ANALYTICS_ENABLED
Falcon DD_TRACE_FALCON_ANALYTICS_ENABLED
Flask DD_TRACE_FLASK_ANALYTICS_ENABLED
Flask Cache DD_TRACE_FLASK_CACHE_ANALYTICS_ENABLED
Grpc DD_TRACE_GRPC_ANALYTICS_ENABLED
httplib DD_TRACE_HTTPLIB_ANALYTICS_ENABLED
Kombu DD_TRACE_KOMBU_ANALYTICS_ENABLED
Molten DD_TRACE_MOLTEN_ANALYTICS_ENABLED
pylibmc DD_TRACE_PYLIBMC_ANALYTICS_ENABLED
Pylons DD_TRACE_PYLONS_ANALYTICS_ENABLED
pymemcache DD_TRACE_PYMEMCACHE_ANALYTICS_ENABLED
Pymongo DD_TRACE_PYMONGO_ANALYTICS_ENABLED
redis DD_TRACE_REDIS_ANALYTICS_ENABLED
redis-py-cluster DD_TRACE_REDISCLUSTER_ANALYTICS_ENABLED
SQLAlchemy DD_TRACE_SQLALCHEMY_ANALYTICS_ENABLED
Vertica DD_TRACE_VERTICA_ANALYTICS_ENABLED

For datastore libraries that extend another, use the setting for the underlying library:

Library Environment Variable
Mongoengine DD_TRACE_PYMONGO_ANALYTICS_ENABLED
mysql-connector DD_TRACE_DBAPI2_ANALYTICS_ENABLED
mysqlclient/MySQL-python DD_TRACE_DBAPI2_ANALYTICS_ENABLED
psycopg DD_TRACE_DBAPI2_ANALYTICS_ENABLED
pymysql DD_TRACE_DBAPI2_ANALYTICS_ENABLED
SQLite DD_TRACE_DBAPI2_ANALYTICS_ENABLED

Where environment variables are not used for configuring the tracer, the instructions for configuring app analytics are provided in the library documentation:

Resolving deprecation warnings

Before upgrading, it’s a good idea to resolve any deprecation warnings raised by your project. These warnings must be fixed before upgrading, otherwise the ddtrace library will not work as expected. Our deprecation messages include the version where the behavior is altered or removed.

In Python, deprecation warnings are silenced by default. To enable them you may add the following flag or environment variable:

$ python -Wall app.py

# or

$ PYTHONWARNINGS=all python app.py

Trace Filtering

It is possible to filter or modify traces before they are sent to the Agent by configuring the tracer with a filters list. For instance, to filter out all traces of incoming requests to a specific url:

Tracer.configure(settings={
    'FILTERS': [
        FilterRequestsOnUrl(r'http://test\.example\.com'),
    ],
})

All the filters in the filters list will be evaluated sequentially for each trace and the resulting trace will either be sent to the Agent or discarded depending on the output.

Use the standard filters

The library comes with a FilterRequestsOnUrl filter that can be used to filter out incoming requests to specific urls:

Write a custom filter

Creating your own filters is as simple as implementing a class with a process_trace method and adding it to the filters parameter of Tracer.configure. process_trace should either return a trace to be fed to the next step of the pipeline or None if the trace should be discarded:

class FilterExample(object):
    def process_trace(self, trace):
        # write here your logic to return the `trace` or None;
        # `trace` instance is owned by the thread and you can alter
        # each single span or the whole trace if needed

# And then instantiate it with
filters = [FilterExample()]
Tracer.configure(settings={'FILTERS': filters})

(see filters.py for other example implementations)

Logs Injection

HTTP layer

Query String Tracing

It is possible to store the query string of the URL — the part after the ? in your URL — in the url.query.string tag.

Configuration can be provided both at the global level and at the integration level.

Examples:

from ddtrace import config

# Global config
config.http.trace_query_string = True

# Integration level config, e.g. 'falcon'
config.falcon.http.trace_query_string = True

Headers tracing

For a selected set of integrations, it is possible to store http headers from both requests and responses in tags.

Configuration can be provided both at the global level and at the integration level.

Examples:

from ddtrace import config

# Global config
config.trace_headers([
    'user-agent',
    'transfer-encoding',
])

# Integration level config, e.g. 'falcon'
config.falcon.http.trace_headers([
    'user-agent',
    'some-other-header',
])
The following rules apply:
  • headers configuration is based on a whitelist. If a header does not appear in the whitelist, it won’t be traced.
  • headers configuration is case-insensitive.
  • if you configure a specific integration, e.g. ‘requests’, then such configuration overrides the default global configuration, only for the specific integration.
  • if you do not configure a specific integration, then the default global configuration applies, if any.
  • if no configuration is provided (neither global nor integration-specific), then headers are not traced.

Once you configure your application for tracing, you will have the headers attached to the trace as tags, with a structure like in the following example:

http {
  method  GET
  request {
    headers {
      user_agent  my-app/0.0.1
    }
  }
  response {
    headers {
      transfer_encoding  chunked
    }
  }
  status_code  200
  url  https://api.github.com/events
}

OpenTracing

The Datadog opentracer can be configured via the config dictionary parameter to the tracer which accepts the following described fields. See below for usage.

Configuration Key Description Default Value
enabled enable or disable the tracer True
debug enable debug logging False
agent_hostname hostname of the Datadog agent to use localhost
agent_https use https to connect to the agent False
agent_port port the Datadog agent is listening on 8126
global_tags tags that will be applied to each span {}
sampler see Sampling AllSampler
priority_sampling see Priority Sampling True
uds_path unix socket of agent to connect to None
settings see Advanced Usage {}

Usage

Manual tracing

To explicitly trace:

import time
import opentracing
from ddtrace.opentracer import Tracer, set_global_tracer

def init_tracer(service_name):
    config = {
      'agent_hostname': 'localhost',
      'agent_port': 8126,
    }
    tracer = Tracer(service_name, config=config)
    set_global_tracer(tracer)
    return tracer

def my_operation():
  span = opentracing.tracer.start_span('my_operation_name')
  span.set_tag('my_interesting_tag', 'my_interesting_value')
  time.sleep(0.05)
  span.finish()

init_tracer('my_service_name')
my_operation()

Context Manager Tracing

To trace a function using the span context manager:

import time
import opentracing
from ddtrace.opentracer import Tracer, set_global_tracer

def init_tracer(service_name):
    config = {
      'agent_hostname': 'localhost',
      'agent_port': 8126,
    }
    tracer = Tracer(service_name, config=config)
    set_global_tracer(tracer)
    return tracer

def my_operation():
  with opentracing.tracer.start_span('my_operation_name') as span:
    span.set_tag('my_interesting_tag', 'my_interesting_value')
    time.sleep(0.05)

init_tracer('my_service_name')
my_operation()

See our tracing trace-examples repository for concrete, runnable examples of the Datadog opentracer.

See also the Python OpenTracing repository for usage of the tracer.

Alongside Datadog tracer

The Datadog OpenTracing tracer can be used alongside the Datadog tracer. This provides the advantage of providing tracing information collected by ddtrace in addition to OpenTracing. The simplest way to do this is to use the ddtrace-run command to invoke your OpenTraced application.

Examples

Celery

Distributed Tracing across celery tasks with OpenTracing.

  1. Install Celery OpenTracing:

    pip install Celery-OpenTracing

  2. Replace your Celery app with the version that comes with Celery-OpenTracing:

    from celery_opentracing import CeleryTracing from ddtrace.opentracer import set_global_tracer, Tracer

    ddtracer = Tracer() set_global_tracer(ddtracer)

    app = CeleryTracing(app, tracer=ddtracer)

Opentracer API

ddtrace-run

ddtrace-run will trace supported web frameworks and database modules without the need for changing your code:

$ ddtrace-run -h

Execute the given Python program, after configuring it
to emit Datadog traces.

Append command line arguments to your program as usual.

Usage: ddtrace-run <my_program>

The environment variables for ddtrace-run used to configure the tracer are detailed in Configuration.

ddtrace-run respects a variety of common entrypoints for web applications:

  • ddtrace-run python my_app.py
  • ddtrace-run python manage.py runserver
  • ddtrace-run gunicorn myapp.wsgi:application
  • ddtrace-run uwsgi --http :9090 --wsgi-file my_app.py

Pass along command-line arguments as your program would normally expect them:

$ ddtrace-run gunicorn myapp.wsgi:application --max-requests 1000 --statsd-host localhost:8125

If you’re running in a Kubernetes cluster and still don’t see your traces, make sure your application has a route to the tracing Agent. An easy way to test this is with a:

$ pip install ipython
$ DATADOG_TRACE_DEBUG=true ddtrace-run ipython

Because iPython uses SQLite, it will be automatically instrumented and your traces should be sent off. If an error occurs, a message will be displayed in the console, and changes can be made as needed.

uWSGI

The default configuration of uWSGI applications does not include the --enable-threads setting which must be set to true for the tracing library to run. This is noted in their best practices doc.

Example run command:

ddtrace-run uwsgi --http :9090 --wsgi-file your_app.py --enable-threads

API

Tracer

Span

Pin

patch_all

patch