Table of Contents

how AI sees images and reads sequences — two tools, one post
#

not all data looks the same. images are grids. logs are sequences. deep learning has a different tool for each.

feed an image into a standard neural network — it loses all spatial information. feed a log sequence into one — it loses all temporal context. wrong tool, wrong result.

two architectures were built to solve this. CNNs for grids. RNNs for sequences.

CNNs — convolutional neural networks
#

built for grid-like data. images, screenshots, binary visualisations.

three layer types working together:

convolutional layers — slide a small filter across the input. detect local patterns. edges, textures, shapes. each filter produces a feature map highlighting where that pattern appears.
pooling layers — shrink the feature maps. keep the important signal, reduce noise and computation.
fully connected layers — take the extracted features, make the final classification decision.

hierarchical learning:

early layers detect edges and textures
middle layers detect shapes and patterns
deeper layers detect complex structures and objects

security application — malware visualisation. convert a binary file to a grayscale image. different malware families produce visually distinct patterns. CNN classifies the family from the image. fast, effective, no code execution required.

RNNs — recurrent neural networks
#

built for sequential data. text, logs, network traffic, time series.

standard networks process inputs independently. RNNs maintain a hidden state — memory of previous inputs. each step considers current input and what came before.

security application — log sequence analysis. user behaviour over time. network connection patterns. the sequence matters as much as the individual event.

the vanishing gradient problem
#

standard RNNs struggle with long sequences. gradients shrink as they travel back through time steps. early inputs stop influencing the model. long term dependencies get lost.

two solutions were built:

LSTM — long short term memory
#

three gates control information flow:

input gate — what new information to store
forget gate — what old information to discard
output gate — what to pass to the next step

memory cell persists important context across long sequences. good for long documents, extended user sessions, prolonged attack sequences.

GRU — gated recurrent unit
#

simpler version of LSTM. two gates instead of three. faster to train. comparable performance in most tasks.

update gate — how much previous state to keep
reset gate — how much previous state to combine with current input

why this matters in security
#

CNNs power malware image classification and screenshot analysis tools. RNNs and LSTMs power user behaviour analytics, log anomaly detection, and sequence based threat detection.

when your UBA tool flags a behavioural sequence — there’s likely an LSTM underneath tracking that session over time.

next up — generative AI and LLMs. the AI everyone is talking about. let’s open the hood.

which feels more relevant to your current SOC environment — image based analysis or sequence based detection?

took ai help to clean up typos. my brain works faster than my fingers. xd

next up: AI Series #10 — “the AI everyone is talking about — how does it actually work” back to series index

how AI sees images and reads sequences — two tools, one post#

CNNs — convolutional neural networks#

RNNs — recurrent neural networks#

the vanishing gradient problem#

LSTM — long short term memory#

GRU — gated recurrent unit#

why this matters in security#