dt-obs-logs
Log querying, filtering, pattern analysis, and error rate calculation. Use when searching application or infrastructure logs, analyzing error patterns, or correlating log data. Trigger: "show error logs", "search logs for keyword", "log error rate", "recent errors", "logs from last hour", "find log entries", "top error messages", "log patterns", "parse JSON logs", "logs by process group", "log trends over time", "log entry counts per minute". Do NOT use for explaining existing queries, product documentation questions, distributed tracing or span analysis (use dt-obs-tracing).
Log Analysis Skill
Query, filter, and analyze Dynatrace log data using DQL for troubleshooting and monitoring.
What This Skill Covers
- Fetching and filtering logs by severity, content, and entity
- Searching log messages using pattern matching
- Calculating error rates and statistics
- Analyzing log patterns and trends
- Grouping and aggregating log data by dimensions
Cross-source join required: If the query must combine logs with host attributes (OS type, hostname, IP address, cloud provider) → also read dt-dql-essentials/references/smartscape-topology-navigation.md before writing the query.
Use Cases
Use this skill when users want to:
- Find specific log entries (e.g., “show me error logs from the last hour”)
- Filter logs by severity, process group, or content
- Search logs for specific keywords or phrases
- Calculate error rates or log statistics
- Identify common error messages or patterns
- Analyze log trends over time
- Troubleshoot issues using log data
Key Concepts
Log Data Model
- timestamp: When the log entry was created
- content: The log message text
- status: Log level (ERROR, FATAL, WARN, INFO, etc.)
- dt.process_group.id: Associated process group entity
- dt.process_group.detected_name: Resolves process group IDs to human-readable names
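To see these fields side by side, a minimal sketch such as the following lists them for recent entries. The 15-minute window and the limit of 20 are arbitrary values chosen for illustration, not recommendations from this skill.

```dql
fetch logs, from:now() - 15m
// Keep only the fields described in the data model above
| fields timestamp, status, content, dt.process_group.id, dt.process_group.detected_name
| sort timestamp desc
| limit 20
```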
Query Patterns
- fetch logs: Primary command for log data access
- Time ranges: Use from:now() - <duration> for time windows
- Filtering: Apply severity, content, and entity filters
- Aggregation: Group and summarize log data
- Pattern Detection: Use matchesPhrase() and contains() for content search
Common Operations
- Severity filtering (single or multiple levels)
- Content search (simple and full-text)
- Entity-based filtering (process groups)
- Time-series analysis (bucketing, sorting)
- Error rate calculation
- Pattern analysis (exceptions, timeouts, etc.)
Core Workflows
1. Log Searching
Find specific log entries by time, severity, and content.
Typical steps:
- Define time range
- Filter by severity (optional)
- Search content for keywords
- Select relevant fields
- Sort and limit results
Example:
fetch logs, from:now() - 1h
| filter status == "ERROR"
| fields timestamp, content, process_group = dt.process_group.detected_name
| sort timestamp desc
| limit 100
2. Log Filtering
Narrow down logs using multiple criteria (severity, entity, content).
Typical steps:
- Fetch logs with time range
- Apply severity filters
- Filter by entity (process_group)
- Apply content filters
- Format and sort output
Example:
fetch logs, from:now() - 2h
| filter in(status, {"ERROR", "FATAL", "WARN"})
| summarize count(), by: {dt.process_group.id, dt.process_group.detected_name}
| fieldsAdd process_group = dt.process_group.detected_name
| sort `count()` desc
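The example above covers the severity and aggregation steps. As a sketch of the entity and content steps from the list, the query below chains all three filters before formatting the output. The process group name "payment-service" and the keyword "timeout" are placeholder values assumed for illustration.

```dql
fetch logs, from:now() - 2h
// Severity filter first, to narrow the result set early
| filter in(status, {"ERROR", "FATAL", "WARN"})
// Entity filter: "payment-service" is a placeholder process group name
| filter dt.process_group.detected_name == "payment-service"
// Content filter: "timeout" is a placeholder keyword
| filter contains(content, "timeout")
| fields timestamp, status, content, process_group = dt.process_group.detected_name
| sort timestamp desc
| limit 100
```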
3. Pattern Analysis
Identify patterns, trends, and anomalies in log data.
Typical steps:
- Fetch logs with time range
- Add pattern detection fields
- Aggregate by entity or time
- Calculate statistics and ratios
- Sort by frequency or rate
Example:
fetch logs, from:now() - 2h
| filter status == "ERROR"
| fieldsAdd
    has_exception = if(matchesPhrase(content, "exception"), true, else: false),
    has_timeout = if(matchesPhrase(content, "timeout"), true, else: false)
| summarize
    count(),
    exception_count = countIf(has_exception == true),
    timeout_count = countIf(has_timeout == true),
    by: {process_group = dt.process_group.detected_name}
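The steps also mention ratios and sorting by rate, which the example stops short of. One possible extension, sketched here, turns the counts into percentages per process group. The field names total_errors, exception_pct, and timeout_pct are illustrative choices, not names defined by this skill.

```dql
fetch logs, from:now() - 2h
| filter status == "ERROR"
| summarize
    total_errors = count(),
    exception_count = countIf(matchesPhrase(content, "exception")),
    timeout_count = countIf(matchesPhrase(content, "timeout")),
    by: {process_group = dt.process_group.detected_name}
// Share of each pattern among a process group's errors, as a percentage
| fieldsAdd
    exception_pct = (exception_count * 100.0) / total_errors,
    timeout_pct = (timeout_count * 100.0) / total_errors
| sort exception_pct desc
```

Because every group returned by summarize contains at least one record, total_errors is never zero in the division.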
Key Functions
Filtering
- filter status == "ERROR" - Filter by status level
- in(status, {"ERROR", "FATAL", "WARN"}) - Multi-status filter (use curly braces for literal sets)
- contains(content, "keyword") - Simple substring search
- matchesPhrase(content, "exact phrase") - Full-text phrase search
Entity Operations
- dt.process_group.detected_name - Get human-readable process group name
- filter process_group == "service-name" - Filter by specific entity (after aliasing process_group = dt.process_group.detected_name)
Aggregation
- count() - Count all log entries
- countIf(condition) - Conditional count
- by: {dimension} - Group by entity or time bucket
- bin(timestamp, 5m) - Time bucketing for trends
Field Operations
- fields timestamp, content, status - Select specific fields
- fieldsAdd name = expression - Add computed fields
- if(condition, true_value, else: false_value) - Conditional logic
Common Patterns
Content Search
Simple substring search:
fetch logs, from:now() - 1h
| filter contains(content, "database")
| fields timestamp, content, status
Full-text phrase search:
fetch logs, from:now() - 1h
| filter matchesPhrase(content, "connection timeout")
| fields timestamp, content, process_group = dt.process_group.detected_name
Error Rate Calculation
Calculate error rates over time:
fetch logs, from:now() - 2h
| summarize
    total_logs = count(),
    error_logs = countIf(status == "ERROR"),
    by: {time_bucket = bin(timestamp, 5m)}
| fieldsAdd error_rate = (error_logs * 100.0) / total_logs
| sort time_bucket asc
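The same bucketing approach also covers plain volume trends, such as log entry counts per minute. The sketch below assumes a one-hour window and splits the counts by status; the field name entry_count is an illustrative choice.

```dql
fetch logs, from:now() - 1h
| summarize
    entry_count = count(),
    by: {time_bucket = bin(timestamp, 1m), status}
| sort time_bucket asc
```

Buckets with no matching log entries do not appear in the result, so gaps in time_bucket indicate minutes without logs for that status.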
Top Error Messages
Find most common errors:
fetch logs, from:now() - 24h
| filter status == "ERROR"
| summarize error_count = count(), by: {content}
| sort error_count desc
| limit 20
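Grouping by content alone does not show which component produces each message. As a sketch of one way to add that context, the variant below uses the process group name as a second grouping dimension.

```dql
fetch logs, from:now() - 24h
| filter status == "ERROR"
| summarize error_count = count(), by: {process_group = dt.process_group.detected_name, content}
| sort error_count desc
| limit 20
```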
Process Group-Specific Logs
Filter logs by process group:
fetch logs, from:now() - 1h
| fieldsAdd process_group = dt.process_group.detected_name
| filter process_group == "payment-service"
| filter status == "ERROR"
| fields timestamp, content, status
| sort timestamp desc
Structured / JSON Log Parsing
Many applications emit JSON-formatted log lines. Use parse to extract fields instead of dumping raw content:
fetch logs, from:now() - 1h
| filter status == "ERROR"
| parse content, "JSON:log"
| fieldsAdd level = log[level], message = log[msg], error = log[error]
| fields timestamp, level, message, error
| sort timestamp desc
| limit 50
Aggregate by a parsed field:
fetch logs, from:now() - 4h
| filter status == "ERROR"
| parse content, "JSON:log"
| fieldsAdd message = log[msg]
| summarize error_count = count(), by: {message}
| sort error_count desc
| limit 20
Notes:
- parse content, "JSON:log" creates a record field log; access nested values with log[key]
- Filter logs with contains() before parse to reduce parsing overhead
- Works with any JSON-structured field, not just content
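The pre-filtering note can be sketched as follows. The query assumes the JSON payload carries a key named error, as in the earlier example; adjust the substring and key to match the actual log format.

```dql
fetch logs, from:now() - 1h
| filter status == "ERROR"
// Substring pre-filter: "error" is the assumed JSON key name
| filter contains(content, "error")
| parse content, "JSON:log"
// Keep only records where the assumed key is present after parsing
| filter isNotNull(log[error])
| fields timestamp, error = log[error]
| sort timestamp desc
| limit 50
```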
Best Practices
- Always specify time ranges - Use from:now() - <duration> to limit data
- Apply filters early - Filter by severity and entity before aggregation
- Use appropriate search methods - contains() for simple substring matches, matchesPhrase() for exact phrases
- Limit results - Add | limit 100 to prevent overwhelming output
- Sort meaningfully - Sort by timestamp for recent logs, by count for top errors
- Name entities - Use dt.process_group.detected_name or getNodeName() for human-readable output
- Use time buckets for trends - bin(timestamp, 5m) for time-series analysis
Integration Points
- Entity model: Uses dt.process_group.id for service correlation
- Time series: Supports temporal analysis with bin() and time ranges
- Content search: Full-text search capabilities via matchesPhrase()
- Aggregation: Statistical analysis using summarize and conditional functions
Limitations & Notes
- Log availability depends on OneAgent configuration and log ingestion
- Full-text search (matchesPhrase) may have performance implications on large datasets
- Entity names require proper OneAgent monitoring for resolution
- Time ranges should be reasonable (avoid unbounded queries)
Troubleshooting
| Problem | Cause | Solution |
|---|---|---|
| No logs returned | Missing time range or too narrow | Widen from: window; verify log ingestion is active |
| getNodeName() returns null | OneAgent not monitoring the entity or entity not yet resolved | Verify OneAgent is deployed and entity is discovered; use dt.process_group.detected_name as a reliable alternative |
| matchesPhrase() slow on large data | Full-text search without pre-filtering | Add filter status == "ERROR" before matchesPhrase() |
| Wrong field name log.level | Common mistake | Use loglevel (no dot) for severity; see dt-dql-essentials |
| Empty content field | Log line was empty or not ingested | Check log source configuration in OneAgent |
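For the "No logs returned" row, one way to tell a too-narrow query from missing ingestion is to drop all filters, widen the window, and count entries per status. This is a sketch only; the 24-hour window is an arbitrary assumption.

```dql
fetch logs, from:now() - 24h
| summarize entry_count = count(), by: {status}
| sort entry_count desc
```

If this query also returns nothing, the cause is more likely ingestion or permissions than the filters in the original query.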
Related Skills
- dt-dql-essentials - Core DQL syntax and query structure for log queries
- dt-obs-tracing - Correlate logs with distributed traces using trace IDs
- dt-obs-problems - Correlate logs with DAVIS-detected problems