# Splunk

### <mark style="color:orange;">**What is Splunk?**</mark>

**Splunk** is a **data analytics and monitoring platform** that ingests, indexes, and analyzes machine-generated data (logs, metrics, traces) for security monitoring (SIEM) and IT operations.

### <mark style="color:orange;">**Key Components**</mark>

| Component               | Role                                         |
| ----------------------- | -------------------------------------------- |
| **`Forwarder`**         | Collects and forwards data (a Universal Forwarder does minimal processing). |
| **`Indexer`**           | Parses, indexes, and stores data.            |
| **`Search Head`**       | Executes SPL queries and visualizes results. |
| **`Deployment Server`** | Manages configurations for forwarders.       |

### <mark style="color:orange;">**How Splunk Works (Step-by-Step)**</mark>

<figure><img src="/files/LEDzz0EGXG6UmEfAB6NZ" alt="" width="375"><figcaption></figcaption></figure>

#### <mark style="color:blue;">**1. Data Ingestion**</mark>

* **Sources**: Logs (files, APIs, syslog), metrics (CPU, memory), and streaming data (Kafka).
* **Forwarders**: Lightweight agents (**Splunk Universal Forwarder**) collect and send data to Splunk.

#### <mark style="color:blue;">**2. Indexing**</mark>

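* **Parsing**: Splunk breaks the incoming stream into events and assigns each one a timestamp and default fields (`host`, `source`, `sourcetype`); most other fields are extracted at search time.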
* **Indexing**: Data is stored in **time-series indexes** for fast retrieval.
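
As a rough illustration of what indexing leaves behind, the search below lists the default fields attached to a handful of recent events. The index name `security` is borrowed from the SPL example on this page and is an assumption about your environment.

```splunk
# Assumed index name: security
index=security earliest=-15m
| head 5
| table _time host source sourcetype index
```

If this returns nothing, the index is probably empty for that time window or named differently.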

#### <mark style="color:blue;">**3. Search & Analysis**</mark>

* **Search Processing Language (SPL)**: Splunk’s query language (e.g., `index=security | stats count by src_ip`).
* **Correlation**: Detects patterns (e.g., brute-force attacks).
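
A correlation search is usually a filter followed by an aggregation and a threshold. The sketch below flags sources that fail logins against many distinct accounts. The `action`, `user`, and `src_ip` field names and the threshold of 10 are assumptions, not values from a real deployment.

```splunk
# Assumed fields: action, user, src_ip (rename to match your data)
index=security action=failure
| stats count AS failures dc(user) AS distinct_users by src_ip
| where distinct_users > 10
| sort - distinct_users
```

`dc(user)` counts distinct usernames per source, which separates one user mistyping a password from one host trying many accounts.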

#### <mark style="color:blue;">**4. Visualization & Alerts**</mark>

* **Dashboards**: Custom charts/tables in Splunk Web.
* **Alerts**: Trigger actions (email, webhook) when thresholds are breached.
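
A threshold alert is typically a scheduled search that returns rows only when the condition is met. The following is a minimal sketch; the index name `web`, the 5-minute window, and the limit of 50 are illustrative assumptions.

```splunk
# Assumed index: web; threshold and window are placeholders
index=web sourcetype=access_combined status>=500 earliest=-5m
| stats count AS server_errors
| where server_errors > 50
```

Saved as an alert with the trigger condition "number of results is greater than 0", this would fire only when the error count exceeds the limit.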

#### <mark style="color:blue;">**5. Storage & Retention**</mark>

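* **Hot/Warm/Cold Buckets**: Splunk rolls index buckets from hot to warm to cold as they age, so older data can sit on cheaper storage; at the end of the retention period, buckets are frozen (deleted by default, or archived).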
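
To see how an index is currently spread across bucket states, the `dbinspect` command can be summarized as sketched below. The index name `security` is an assumption; substitute one that exists in your deployment.

```splunk
# Assumed index name: security
| dbinspect index=security
| stats count AS buckets sum(sizeOnDiskMB) AS size_mb by state
```

Each row shows one bucket state (for example `hot`, `warm`, or `cold`) with the number of buckets and their size on disk.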

***

### <mark style="color:orange;">**SPL (Search Processing Language) Reference**</mark>

#### <mark style="color:blue;">**1. BASIC SEARCHES**</mark>

```splunk
error                         # Simple text search  
"connection timeout"          # Phrase search  
sourcetype=access_*           # Wildcard sourcetype match  
status=404                    # Exact field match  
bytes>1000                    # Numeric comparison  
```

#### <mark style="color:blue;">**2. BOOLEAN OPERATORS**</mark>

```splunk
(failed OR error)             # OR condition  
status=200 AND method=POST    # AND condition  
NOT client_ip=192.168.1.*     # Exclusion  
```
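
Search terms separated by a space are implicitly joined with AND, and the Boolean operators must be written in uppercase. A combined sketch, reusing the field names from the examples above:

```splunk
# Implicit AND between terms; parentheses group the OR
sourcetype=access_* (status=404 OR status=500) NOT client_ip=192.168.1.*
```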

#### <mark style="color:blue;">**3. FIELD EXTRACTION**</mark>

```splunk
# Regex extraction  
| rex "user=(?<username>\w+)"  

# JSON extraction  
| spath input=json_field  

# Create new field  
| eval mb=bytes/1024/1024  
```
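
These commands are normally chained in one pipeline. The sketch below extracts a username, converts bytes to megabytes, and totals traffic per user; it assumes the raw events contain `user=<name>` and a numeric `bytes` field.

```splunk
# Assumes events contain "user=<name>" and a numeric bytes field
sourcetype=access_*
| rex "user=(?<username>\w+)"
| eval mb=round(bytes/1024/1024, 2)
| stats sum(mb) AS total_mb by username
```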

#### <mark style="color:blue;">**4. STATISTICAL COMMANDS**</mark>

```splunk
# Count events by field  
| stats count by user  

# Time-based aggregation  
| timechart span=1h count by status  

# Top values  
| top 10 client_ip  
```

#### <mark style="color:blue;">**5. TIME FILTERS**</mark>

```splunk
# Relative time  
earliest=-24h latest=now  

# Absolute time  
earliest="06/01/2023:00:00:00" latest="06/02/2023:00:00:00"  

# Time bucketing  
| bin _time span=15m  
```
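
Relative time modifiers can also snap to a unit boundary with `@`. The sketch below covers yesterday from midnight to midnight and counts events per hour; the index name `security` is an assumption.

```splunk
# "@d" snaps to the start of the day, so this is yesterday 00:00 to today 00:00
index=security earliest=-1d@d latest=@d
| bin _time span=1h
| stats count by _time
```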

#### <mark style="color:blue;">**6. ADVANCED EXAMPLES**</mark>

<mark style="color:red;">**Security Alert (Brute Force):**</mark>

```splunk
sourcetype=auth failed  
| stats count by src_ip  
| where count>5  
| sort -count  
```
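
The query above counts failures over the whole search time range, so a slow trickle of failures can cross the threshold. A hedged variant that counts per 10-minute window instead (the window size and the one-hour lookback are assumptions to tune):

```splunk
# Count failures per source within each 10-minute window
sourcetype=auth failed earliest=-1h
| bin _time span=10m
| stats count by _time, src_ip
| where count > 5
| sort - count
```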

**Application Performance:**

```splunk
sourcetype=nginx response_time>2000  
| stats avg(response_time) by app_name  
```
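
Because the query above filters on `response_time>2000` before averaging, the result is the average of the slow requests only. One possible alternative is to aggregate over all requests and filter afterwards, for example on the 95th percentile. The output field names here are illustrative.

```splunk
# Aggregate across all requests, then keep apps whose 95th percentile is high
sourcetype=nginx
| stats avg(response_time) AS avg_rt perc95(response_time) AS p95_rt count AS requests by app_name
| where p95_rt > 2000
| sort - p95_rt
```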

#### <mark style="color:blue;">**7. TROUBLESHOOTING**</mark>

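| **Issue**        | **Solution**                                                                                  |
| ---------------- | --------------------------------------------------------------------------------------------- |
| No results       | Check the index, sourcetype, and time range                                                   |
| Slow performance | Narrow the time range, filter on index and sourcetype early, use `tstats` for indexed fields  |
| Memory errors    | Reduce the amount of data searched, e.g. with event sampling or a shorter time range          |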
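
When a search returns nothing, a quick first check is whether any data exists under the index and sourcetype you expect. A minimal `tstats` sketch follows; the 24-hour window is an arbitrary assumption.

```splunk
# Lists every index/sourcetype pair with events in the last 24 hours
| tstats count where index=* earliest=-24h by index, sourcetype
| sort - count
```

`tstats` reads indexed metadata rather than raw events, which is also why it tends to be much faster than an equivalent `stats` search.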

***

