YARA Rules
- David Tull
- Nov 14
Updated: Nov 18
Background
I’m pretty lucky to have access to a lot of paid forensic tools, but I still find myself manually digging through extractions and databases. Honestly, I enjoy the “thrill of the hunt” when looking for new data sources. One database family I’ve been using frequently is the set associated with the Life360 application—it continues to be a goldmine.
Recently, I had a case involving a large mobile device extraction. The extraction was being processed through a paid tool, but the investigator needed location information immediately because the suspect was believed to be fleeing. To make things more interesting, it was late on a Friday afternoon, meaning that unless I wanted to work a very long day, that tool’s report wasn’t going to be ready until Monday.
So I opened up the zipped extraction and started hunting for the usual suspects. I quickly located the Life360 databases, ran them through my parser (GitHub link), and pulled the information the investigator needed. He confronted the suspect with the location data; the suspect promptly admitted to lying and gave a full confession hours later.
"Great Success!" - Borat
That went well, but I wanted a faster way to analyze large extractions, automatically find the databases I need, and extract them. Enter YARA. Below I’ll explain what YARA is, break down some of its nuances, and then show you the solution I’m currently using.
What Is YARA?
YARA (short for Yet Another Recursive Acronym) is a powerful pattern-matching tool commonly used in security and incident response. Teams often use it to identify and classify malware by looking for signatures, strings (Base64, hex, ASCII, regex), and even behavioral indicators.
That’s how I originally learned about YARA—even though, working mostly in Digital Forensics, I don’t get a ton of IR/NIT work. But YARA still has plenty of uses in DFIR, especially when it comes to scanning large datasets quickly.
Rule Structure
rule NameOfRule {
    meta:
        author = "Your Name"
    strings:
        $string_name = "String to look for."
    condition:
        $string_name
}
Breaking it down:
rule
Starts the rule and assigns it a unique name.
meta
Contains descriptive information about the rule—whatever you want, but common fields include:
author
description
category
date
family
md5
version
strings
Defines what patterns YARA should search for. These can be:
ASCII
Hex
Wide (UTF-16)
Regex
Base64
XOR variants
And more
condition
The logic that determines when a match occurs. Examples include:
listing specific string names
"any of them"
"all of them"
"filesize > 100KB"
YARA String Examples
For the examples below, let’s assume we’re searching for this hex sequence:
FFD8FFE09A6ED2BFFFD9
Exact Match
$jpg = { FF D8 FF E0 9A 6E D2 BF FF D9 }
Wildcard Bytes
$jpg_wildcard = { FF D8 FF ?? ?? 6E D2 BF FF D9 }
NOT Operator
$jpg_not = { FF D8 FF E0 9A 6E D2 BF FF ~00 }
Jump / Byte Range
$jpg_jump = { FF D8 FF [1-4] BF FF D9 }
Infinite Wildcard
$jpg_jump2 = { FF D8 FF [-] FF D9 }
OR Operator
$jpg_or = { FF D8 FF (E0 | E1) 9A 6E D2 BF FF D9 }
Combined
$jpg_combine = { FF D8 FF (E0 | E1) [-] FF D9 }
Additional String Modifiers
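ascii: matches plain ASCII text (the default when no modifier is given)
nocase: case-insensitive
wide: two-byte encoding (each character followed by a zero byte)
xor: single-byte XOR search
base64: searches Base64-encoded variants
fullword: matches only if delimited by non-alphanumeric characters
private: used internally, not exposed in results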
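If the hex string variants shown earlier feel abstract, it can help to see them as byte-level regular expressions. The sketch below is only an approximation written with Python's re module, not YARA's own engine, and the dictionary name APPROXIMATIONS and the helper matches() are made up for this example. It runs each pattern against the sample sequence FFD8FFE09A6ED2BFFFD9.

```python
import re

# The sample byte sequence used in the hex string examples.
SAMPLE = bytes.fromhex("FFD8FFE09A6ED2BFFFD9")

# Rough Python regex stand-ins for each YARA hex string (illustrative only).
APPROXIMATIONS = {
    "jpg": rb"\xFF\xD8\xFF\xE0\x9A\x6E\xD2\xBF\xFF\xD9",           # exact bytes
    "jpg_wildcard": rb"\xFF\xD8\xFF..\x6E\xD2\xBF\xFF\xD9",         # ?? ?? = any two bytes
    "jpg_not": rb"\xFF\xD8\xFF\xE0\x9A\x6E\xD2\xBF\xFF[^\x00]",     # ~00 = any byte except 00
    "jpg_jump": rb"\xFF\xD8\xFF.{1,4}\xBF\xFF\xD9",                 # [1-4] = one to four bytes
    "jpg_jump2": rb"\xFF\xD8\xFF.*?\xFF\xD9",                       # [-] = any number of bytes
    "jpg_or": rb"\xFF\xD8\xFF(?:\xE0|\xE1)\x9A\x6E\xD2\xBF\xFF\xD9",  # (E0 | E1)
    "jpg_combine": rb"\xFF\xD8\xFF(?:\xE0|\xE1).*?\xFF\xD9",        # alternation plus jump
}


def matches(name: str, data: bytes) -> bool:
    """Return True if the approximated pattern is found anywhere in data."""
    # DOTALL lets "." match every byte value, including 0x0A (newline).
    return re.search(APPROXIMATIONS[name], data, re.DOTALL) is not None


for pattern_name in APPROXIMATIONS:
    print(pattern_name, matches(pattern_name, SAMPLE))
```

Swapping the fourth byte for E2 is a quick way to see the difference: the wildcard and jump versions should still hit, while the exact and OR versions should not.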
regex (my love/hate relationship continues…)
If you’re like me and constantly forget regex syntax, I recommend keeping the documentation nearby.
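To tie the rule structure, conditions, and modifiers together, here is a minimal sketch using yara-python. The rule names (AnyName, BothNames), the file names inside them, and the helper matching_rules() are all invented for illustration. It only relies on yara.compile(source=...) and rules.match(data=...), and it returns None if yara-python is not installed so the snippet still loads.

```python
try:
    import yara  # provided by the yara-python package
except ImportError:
    yara = None  # lets the sketch load even where yara-python is missing

# Two illustrative rules; the names and strings are made up for this example.
RULE_SOURCE = r"""
rule AnyName {
    meta:
        description = "Fires if either file name appears"
    strings:
        $a = "examplestore.db" nocase
        $b = "Example.sqlite"
    condition:
        any of them
}

rule BothNames {
    strings:
        $a = "examplestore.db" nocase
        $b = "Example.sqlite"
    condition:
        all of them
}
"""


def matching_rules(data: bytes):
    """Return the sorted names of rules that match data, or None without yara."""
    if yara is None:
        return None
    rules = yara.compile(source=RULE_SOURCE)
    return sorted(m.rule for m in rules.match(data=data))


# One name present: only the "any of them" rule should fire.
print(matching_rules(b"/private/var/ExampleStore.db"))
# Both names present: both rules should fire.
print(matching_rules(b"ExampleStore.db and Example.sqlite"))
```

The nocase modifier is what lets the lowercase rule string hit "ExampleStore.db"; drop it and the first call should come back empty.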
My Solution
Since I enjoy writing in Python, I wanted to build something around it. I was also experimenting with Omarchy Linux and found that I really liked the intuitiveness of TUI-based tools. So I set out to build a TUI-based Python script that:
Accepts a path to an extraction
Accepts a custom set of YARA rules
Finds matching files quickly
Copies the matched files out for analysis
I started by writing a couple of YARA rules:
rule Life360 {
    meta:
        author = "Forensicator956"
        description = "Yara rule to locate and extract databases relating to Life360."
        category = "Location"
        creation_date = "2025-11-14"
        family = "All"
    strings:
        $db = "L360EventStore.db"
        $db_service = "L360EventStore_service.db"
        $db_failed = "FailedLocationTopic.sqlite"
    condition:
        any of them
}
rule VulgarWords {
    meta:
        author = "Forensicator956"
        description = "Yara rule to locate the Vulgar Words database on iOS devices."
        category = "Other"
        creation_date = "2025-11-14"
        family = "iOS"
    strings:
        $VulgarWords_db = "VulgarWordUsage.db"
    condition:
        $VulgarWords_db
}
After getting the Python portion working, I asked my buddy Chad (ChatGPT) to help integrate the TUI components. I’ve never built one before and didn’t have time to learn it from scratch. I pasted in my code and let Chad clean it up, comment it, and add the TUI logic. You can thank Chad later; my code tends to favor function over visual appeal.
Inside the code, there’s a commented-out section where you can optionally scan inside files (including databases). If you uncomment it, YARA will examine file contents in addition to filenames. I’m cautious about opening unknown files in Python; reading alone shouldn’t execute anything, but better to be careful. It also slows the script dramatically: from under 2 seconds to over 12 minutes on a 40GB extraction.
Requirements:
tqdm
yara-python


#!/usr/bin/env python3
import curses
import os
import zipfile
import yara  # type: ignore
import argparse
from typing import List, Tuple
from tqdm import tqdm

# ======================================================================
# ZIP SCANNING + YARA MATCHING
# ======================================================================
def scan_zip_for_matches(zip_path: str, rules) -> List[Tuple[str, List[str]]]:
    """
    Scans a ZIP archive with YARA rules.
    Returns a list of tuples: (file_path_inside_zip, matched_rule_list)
    """
    matches = []
    with zipfile.ZipFile(zip_path, "r") as z:
        namelist = z.namelist()
        print(f"[*] Scanning {len(namelist)} files with YARA…")
        for name in tqdm(namelist, desc="Scanning", unit="file"):
            matched = set()
            # Filename matching
            try:
                for m in rules.match(data=name):
                    matched.add(m.rule)
            except Exception:
                pass
            ## NOTE: Content scanning disabled for speed & safety
            ## Uncomment carefully: opening unknown files can be slow/dangerous
            ##
            # try:
            #     with z.open(name) as f:
            #         data = f.read()
            #     for m in rules.match(data=data):
            #         matched.add(m.rule)
            # except Exception:
            #     pass
            if matched:
                matches.append((name, sorted(matched)))
    return matches

def is_folder(path: str) -> bool:
    """Detects if a ZIP path entry represents a folder."""
    return path.endswith("/")


def safe_extract(zip_path: str, files: List[str], out_dir: str):
    """
    Extracts only the selected files.
    Flattens paths so all files land directly inside out_dir/.
    """
    os.makedirs(out_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path, "r") as z:
        for f in files:
            # Folder entries have no content and an empty basename, so skip them
            if is_folder(f):
                continue
            filename_only = os.path.basename(f)
            target = os.path.join(out_dir, filename_only)
            try:
                with z.open(f) as fr, open(target, "wb") as fw:
                    fw.write(fr.read())
            except Exception as e:
                print(f"[!] Failed to extract {f}: {e}")

# ======================================================================
# CYBERPUNK CURSES UI
# ======================================================================
class CursesUI:
    def __init__(self, matches, zip_path, out_dir):
        self.matches = matches
        self.zip_path = zip_path
        self.out_dir = out_dir
        self.selected = [False] * len(matches)
        self.idx = 0
        self.scroll = 0

    # --------------------------------------------------------------
    # Entry point
    # --------------------------------------------------------------
    def run(self):
        curses.wrapper(self.main)

    # --------------------------------------------------------------
    # Color initialization
    # --------------------------------------------------------------
    def init_colors(self):
        """
        Defines cyberpunk color palette for curses.
        """
        curses.start_color()
        # Basic neon cyberpunk palette
        curses.init_pair(1, curses.COLOR_CYAN, curses.COLOR_BLACK)     # header
        curses.init_pair(2, curses.COLOR_MAGENTA, curses.COLOR_BLACK)  # highlight
        curses.init_pair(3, curses.COLOR_GREEN, curses.COLOR_BLACK)    # selected file
        curses.init_pair(4, curses.COLOR_YELLOW, curses.COLOR_BLACK)   # rule list
        curses.init_pair(5, curses.COLOR_MAGENTA, curses.COLOR_BLACK)  # footer / help bar

    # --------------------------------------------------------------
    # Main UI loop
    # --------------------------------------------------------------
    def main(self, stdscr):
        curses.curs_set(0)
        stdscr.nodelay(False)
        stdscr.keypad(True)
        self.init_colors()
        while True:
            stdscr.clear()
            max_y, max_x = stdscr.getmaxyx()
            list_width = int(max_x * 0.65)
            sidebar_x = list_width + 2
            # -------------------------
            # TITLE + HEADER
            # -------------------------
            stdscr.attron(curses.color_pair(1))
            stdscr.addstr(0, 0, "YARA ZIP SCANNER (FF EDITION)")
            stdscr.addstr(1, 0, f"Archive: {self.zip_path}")
            stdscr.attroff(curses.color_pair(1))
            stdscr.addstr(3, 0, "Sel File".ljust(list_width - 1), curses.color_pair(1))
            stdscr.addstr(3, sidebar_x, "Matched Rules", curses.color_pair(1))
            # -------------------------
            # SCROLL LOGIC
            # -------------------------
            visible_rows = max_y - 6
            if self.idx < self.scroll:
                self.scroll = self.idx
            elif self.idx >= self.scroll + visible_rows:
                self.scroll = max(0, self.idx - visible_rows + 1)
            # -------------------------
            # MAIN FILE LIST
            # -------------------------
            for screen_row in range(visible_rows):
                match_idx = self.scroll + screen_row
                if match_idx >= len(self.matches):
                    break
                fname, _rules = self.matches[match_idx]
                # Selected marker
                marker = "[x]" if self.selected[match_idx] else "[ ]"
                display = f"{marker} {fname}"
                # Highlight current row
                if match_idx == self.idx:
                    stdscr.addstr(
                        4 + screen_row,
                        0,
                        display[:list_width - 1],
                        curses.color_pair(2) | curses.A_BOLD
                    )
                else:
                    color = curses.color_pair(3) if self.selected[match_idx] else curses.A_NORMAL
                    stdscr.addstr(4 + screen_row, 0, display[:list_width - 1], color)
            # -------------------------
            # SIDEBAR: MATCHED RULES
            # -------------------------
            if 0 <= self.idx < len(self.matches):
                _, rule_list = self.matches[self.idx]
                y = 4
                for r in rule_list:
                    if y < max_y - 2:
                        stdscr.addstr(
                            y,
                            sidebar_x,
                            f"- {r}",
                            curses.color_pair(4)
                        )
                        y += 1
            # -------------------------
            # FOOTER / HELP BAR
            # -------------------------
            stdscr.attron(curses.color_pair(5))
            stdscr.addstr(
                max_y - 1,
                0,
                "↑/↓ Move SPACE Toggle ENTER Extract A All D None ESC Quit".ljust(max_x - 1)
            )
            stdscr.attroff(curses.color_pair(5))
            # -------------------------
            # INPUT HANDLING
            # -------------------------
            key = stdscr.getch()
            if key in (curses.KEY_UP, ord('k')):
                self.idx = max(0, self.idx - 1)
            elif key in (curses.KEY_DOWN, ord('j')):
                self.idx = min(len(self.matches) - 1, self.idx + 1)
            elif key == ord(' '):
                # Toggle file or whole directory
                current_path = self.matches[self.idx][0]
                new_val = not self.selected[self.idx]
                if is_folder(current_path):
                    self.selected[self.idx] = new_val
                    for i, (path, _) in enumerate(self.matches):
                        if path.startswith(current_path) and not is_folder(path):
                            self.selected[i] = new_val
                else:
                    self.selected[self.idx] = new_val
            elif key in (ord('a'), ord('A')):
                # Accept both cases, since the help bar shows the key as "A"
                self.selected = [True] * len(self.matches)
            elif key in (ord('d'), ord('D')):
                self.selected = [False] * len(self.matches)
            elif key in (curses.KEY_ENTER, 10, 13):
                selected_files = [self.matches[i][0] for i, v in enumerate(self.selected) if v]
                if selected_files:
                    self.extract_popup(stdscr, selected_files)
                else:
                    curses.flash()
            elif key == 27:  # ESC
                break

    # --------------------------------------------------------------
    # Extraction popup window
    # --------------------------------------------------------------
    def extract_popup(self, stdscr, files):
        max_y, max_x = stdscr.getmaxyx()
        box_w = 50
        box_h = 7
        box_y = (max_y - box_h) // 2
        box_x = (max_x - box_w) // 2
        safe_extract(self.zip_path, files, self.out_dir)
        win = curses.newwin(box_h, box_w, box_y, box_x)
        win.border()
        win.addstr(1, 2, "Extraction Complete!", curses.color_pair(2) | curses.A_BOLD)
        win.addstr(3, 2, f"{len(files)} files saved to:")
        win.addstr(4, 2, self.out_dir[:box_w - 4], curses.color_pair(3))
        win.addstr(5, 2, "Press any key to continue.")
        win.refresh()
        win.getch()

# ======================================================================
# MAIN
# ======================================================================
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("zip_path")
    parser.add_argument("rules_path")
    parser.add_argument("-o", "--out", default="extracted")
    args = parser.parse_args()
    rules = yara.compile(filepath=args.rules_path)
    matches = scan_zip_for_matches(args.zip_path, rules)
    if not matches:
        print("[*] No matches found.")
        return
    ui = CursesUI(matches, args.zip_path, args.out)
    ui.run()


if __name__ == "__main__":
    main()
The code will also be on my GitHub. You may see in the screenshots that I originally named the program py_yara.py but opted to change that to rift.py since everything needs an acronym (Rules Initiated Forensic Triage).
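One thing worth knowing about the script above: because safe_extract flattens paths, two matched files with the same name in different folders will overwrite each other in the output directory. The helper below is not part of the script; unique_target() is a hypothetical name and just one possible way to keep both copies, by appending a counter to repeated names.

```python
import os


def unique_target(out_dir: str, member: str, used: set) -> str:
    """Pick a flattened output path for a ZIP member without reusing a name.

    `used` holds the file names already handed out during this extraction.
    """
    base = os.path.basename(member)
    stem, ext = os.path.splitext(base)
    candidate = base
    counter = 1
    # Append _1, _2, ... until the name has not been seen before.
    while candidate in used:
        candidate = f"{stem}_{counter}{ext}"
        counter += 1
    used.add(candidate)
    return os.path.join(out_dir, candidate)


seen = set()
for member in ["app_a/db/Example.db", "app_b/db/Example.db"]:
    print(member, "->", unique_target("extracted", member, seen))
```

In safe_extract, this would stand in for the os.path.join(out_dir, filename_only) line, with one shared set created before the loop. Keeping a small log of original path to output name would also help when documenting where each file came from.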
Additional Thoughts:
While writing the rules, I noticed that searching for a database by its name string also returns the -wal and -shm files, because "L360EventStore.db" is a substring of "L360EventStore.db-wal". Those files can matter depending on the case, but they can also clutter the results. One option is a regex that returns only the database: instead of "L360EventStore.db", use /L360EventStore\.db$/, where $ anchors the match to the end of the scanned name and \. matches a literal dot. If you expect to toggle this often, write the rule as /L360EventStore\.db/, which still returns the -wal and -shm files, and add the $ whenever you want to exclude them.
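Here is a quick way to see the difference the $ makes. This sketch uses Python's re module as a stand-in for YARA's regex engine, which should behave the same for patterns this simple, and the file paths are made-up examples.

```python
import re

# Example paths as they might appear inside an extraction (illustrative only).
names = [
    "data/L360EventStore.db",
    "data/L360EventStore.db-wal",
    "data/L360EventStore.db-shm",
]

loose = re.compile(r"L360EventStore\.db")      # matches anywhere in the name
anchored = re.compile(r"L360EventStore\.db$")  # must end right after ".db"

loose_hits = [n for n in names if loose.search(n)]
anchored_hits = [n for n in names if anchored.search(n)]

print("loose:   ", loose_hits)
print("anchored:", anchored_hits)
```

The loose pattern should return all three paths, while the anchored one should keep only the main database file.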











