108 changes: 58 additions & 50 deletions docs/architecture/00-baseline/v1/url-shortener-v1-hld.excalidraw
@@ -2142,8 +2142,8 @@
"frameId": null,
"x": -220.66529003471368,
"y": -303.5077851544676,
"width": 171.50159565320678,
"height": 128.9607875234742,
"width": 153.29535786023803,
"height": 125.23422502347421,
"angle": 0,
"strokeColor": "#f08c00",
"backgroundColor": "transparent",
@@ -2155,16 +2155,16 @@
"groupIds": [],
"roundness": null,
"seed": 1036090294,
"version": 4124,
"versionNonce": 675132968,
"version": 4179,
"versionNonce": 648276660,
"isDeleted": false,
"boundElements": [
{
"id": "AIeHzK18lJWz60vjlutAx",
"type": "text"
}
],
"updated": 1774769149312,
"updated": 1780887497358,
"link": null,
"locked": false,
"points": [
@@ -2174,15 +2174,15 @@
],
[
0,
63.59563320416075
61.73235195416075
],
[
171.50159565320678,
63.59563320416075
153.29535786023803,
61.73235195416075
],
[
171.50159565320678,
128.9607875234742
153.29535786023803,
125.23422502347421
]
],
"lastCommittedPoint": null,
@@ -2259,10 +2259,10 @@
"type": "arrow",
"index": "agG",
"frameId": null,
"x": -50.10924709348319,
"y": -54.621333771234035,
"width": 308.1567959810417,
"height": 175.36087468829965,
"x": -68.31548488645194,
"y": -58.347896271234035,
"width": 326.36303377401043,
"height": 179.08743718829965,
"angle": 0,
"strokeColor": "#f08c00",
"backgroundColor": "transparent",
@@ -2274,16 +2274,16 @@
"groupIds": [],
"roundness": null,
"seed": 1565690968,
"version": 4831,
"versionNonce": 913617192,
"version": 4887,
"versionNonce": 1655437108,
"isDeleted": false,
"boundElements": [
{
"id": "q9KVWSku_HTql4OztURKJ",
"type": "text"
}
],
"updated": 1774769149313,
"updated": 1780887497472,
"link": null,
"locked": false,
"points": [
@@ -2293,19 +2293,19 @@
],
[
0,
111.42588623872977
115.15244873872977
],
[
97.94944876125368,
111.42588623872977
116.15568655422243,
115.15244873872977
],
[
97.94944876125368,
175.36087468829965
116.15568655422243,
179.08743718829965
],
[
308.1567959810417,
175.36087468829965
326.36303377401043,
179.08743718829965
]
],
"lastCommittedPoint": null,
@@ -2341,22 +2341,22 @@
"index": 2,
"start": [
0,
111.42588623872977
115.15244873872977
],
"end": [
97.94944876125368,
111.42588623872977
116.15568655422243,
115.15244873872977
]
},
{
"index": 3,
"start": [
97.94944876125368,
111.42588623872977
116.15568655422243,
115.15244873872977
],
"end": [
97.94944876125368,
175.36087468829965
116.15568655422243,
179.08743718829965
]
}
],
@@ -2405,10 +2405,10 @@
"type": "arrow",
"index": "agd",
"frameId": null,
"x": -179.99130314362878,
"y": -54.43363240533003,
"width": 0.5090771940789693,
"height": 183.66922976935638,
"x": -198.19754093659753,
"y": -58.16019490533003,
"width": 53.18648124659856,
"height": 188.37120351866412,
"angle": 0,
"strokeColor": "#f08c00",
"backgroundColor": "transparent",
@@ -2420,16 +2420,16 @@
"groupIds": [],
"roundness": null,
"seed": 777376600,
"version": 5581,
"versionNonce": 843957800,
"version": 5636,
"versionNonce": 336055732,
"isDeleted": false,
"boundElements": [
{
"id": "6EfIXbMzRtJlLO1vtPj_v",
"type": "text"
}
],
"updated": 1774769151817,
"updated": 1780887497359,
"link": null,
"locked": false,
"points": [
@@ -2438,8 +2438,16 @@
0
],
[
0.5090771940789693,
183.66922976935638
0,
94.08843335104227
],
[
53.18648124659856,
94.08843335104227
],
[
53.18648124659856,
188.37120351866412
]
],
"lastCommittedPoint": null,
@@ -3298,8 +3306,8 @@
"type": "rectangle",
"index": "ay",
"frameId": null,
"x": -209.12305427739824,
"y": -169.6279692219096,
"x": -227.329292070367,
"y": -173.3545317219096,
"width": 199.74500710497375,
"height": 110,
"angle": 0,
@@ -3315,8 +3323,8 @@
"type": 3
},
"seed": 1430654040,
"version": 1498,
"versionNonce": 2066601560,
"version": 1553,
"versionNonce": 968050612,
"isDeleted": false,
"boundElements": [
{
@@ -3336,15 +3344,15 @@
"type": "arrow"
}
],
"updated": 1774769149311,
"updated": 1780887497357,
"link": null,
"locked": false
},
{
"id": "qKIIAMdmMGTIov9tNsvuj",
"type": "text",
"x": -197.94049213116136,
"y": -164.6279692219096,
"x": -216.1467299241301,
"y": -168.3545317219096,
"width": 177.3798828125,
"height": 100,
"angle": 0,
@@ -3360,11 +3368,11 @@
"index": "az",
"roundness": null,
"seed": 376619352,
"version": 283,
"versionNonce": 921661992,
"version": 338,
"versionNonce": 2027434292,
"isDeleted": false,
"boundElements": [],
"updated": 1774769106192,
"updated": 1780887497357,
"link": null,
"locked": false,
"text": "Cloudflare\n(DDoS Absorption\nWAF Rate Limiting\nDNS Proxy)",
@@ -0,0 +1,14 @@
Title: Extracting Training Data from Large Language Models

Professor Name: Habeeb Olufowobi
Student Name: Harshwardhan Patil
Student ID: 1002224144
Date: 04.22.2026

This paper by Carlini and colleagues investigates whether an adversary can recover verbatim text from a large language model's training data using nothing but black-box query access. The authors target GPT-2, a 1.5 billion parameter model trained on 40 gigabytes of public internet text, and demonstrate a two-phase attack. The first phase generates 200,000 text samples with each of three generation strategies (600,000 in total): basic top-n sampling, decaying temperature sampling, and seeding with real internet text prefixes scraped from Common Crawl. The second phase filters those samples using six membership inference metrics, most of which compare GPT-2's likelihood against a reference, either a smaller GPT-2 variant or a classical compression algorithm called zlib, to find samples where GPT-2 is anomalously confident. The authors verify results through both Google search and direct query access to OpenAI's training dataset, confirming 604 unique memorized examples with a best-case precision of 67%. Importantly, the paper frames this 604 figure explicitly as a lower bound, stating that among 600,000 honestly generated samples, at least 0.1% contain memorized text. This 0.1% is itself an extremely loose floor, because the extraction used only simple short prompts: nearly no extracted example could be reproduced with the short prompt that originally surfaced it, but nearly all were reproduced when the model was given the full preceding training context.
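
The zlib comparison in the filtering phase can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, the exact form of the ratio is an assumption, and the model's log-perplexity is passed in as a number rather than computed from GPT-2.

```python
import zlib


def zlib_entropy_bits(text: str) -> int:
    """Size in bits of the zlib-compressed UTF-8 encoding of `text`."""
    return 8 * len(zlib.compress(text.encode("utf-8")))


def zlib_ratio_score(text: str, model_log_perplexity: float) -> float:
    """Hypothetical score: zlib bits divided by the model's log-perplexity.

    A high score means zlib finds the text hard to compress while the
    model finds it easy to predict, which is the anomaly the filter targets.
    """
    if model_log_perplexity <= 0:
        raise ValueError("log-perplexity must be positive")
    return zlib_entropy_bits(text) / model_log_perplexity


def rank_by_zlib_ratio(samples: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Sort (text, log_perplexity) pairs, most suspicious first."""
    return sorted(samples, key=lambda s: zlib_ratio_score(s[0], s[1]), reverse=True)
```

With equal model confidence, a highly repetitive sample scores lower than a varied one, because zlib also finds repetition easy. This is why a compression reference helps filter out trivially repeated text.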

The paper's central contribution goes beyond the attack itself. Before describing the attack, the authors formally define what it means for a model to know a string: a string is considered extractable if there exists some prefix that causes the model to generate it as its most likely continuation. This definition is the foundation everything else builds on. They then introduce k-eidetic memorization, where k represents how many distinct training documents contain the memorized string, with k = 1 being the most sensitive case. One subtle but important aspect of this definition is that it counts distinct documents, not total occurrences, so a string that appears 50 times within a single document still counts as k = 1 memorization. On this basis the paper specifically criticizes GPT-2's document-level deduplication as insufficient: a string can appear dozens of times within one document, escape deduplication entirely, and be repeated often enough to be fully memorized. The authors demonstrate k = 1 extraction of personally identifiable information, including a real individual's full name, address, phone number, email address, and fax number. They also show that memorization scales with model size: GPT-2 XL memorizes 18 times more content than GPT-2 Small, with complete memorization triggering at just 33 repetitions within a single training document.
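
The distinction between distinct documents and total occurrences can be made concrete with a small sketch. The function names and the corpus below are made up for illustration, and exact substring matching is an assumption, not the paper's procedure.

```python
def eidetic_k(candidate: str, documents: list[str]) -> int:
    """Number of distinct documents that contain `candidate` at least once."""
    return sum(1 for doc in documents if candidate in doc)


def total_occurrences(candidate: str, documents: list[str]) -> int:
    """Total non-overlapping occurrences of `candidate` across all documents."""
    return sum(doc.count(candidate) for doc in documents)


# Made-up corpus: the string is repeated 50 times, but only inside one document.
secret = "id=12345"
corpus = [" ".join([secret] * 50), "an unrelated document", "another one"]

print(eidetic_k(secret, corpus))          # distinct documents containing it
print(total_occurrences(secret, corpus))  # raw repetition count
```

Document-level deduplication only reduces the first number. It leaves the second untouched, which is the gap the paper criticizes.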

There is a lot to appreciate in how this paper is structured and argued. The decision to attack GPT-2 specifically is ethically well-reasoned, and the dual verification process gives the results a level of credibility that is rare in this area. Most related work on memorization relied on artificially inserted canary sequences: fake secrets planted deliberately in the training data and then checked for leakage. The critical limitation of that approach is that the researcher already knows what secret to look for. This paper finds memorization of naturally occurring content without knowing in advance what to search for, which is a fundamentally stronger and more realistic threat demonstration. The finding that deleted content can be recovered from the model was one of the more striking results. The idea that GPT-2 functions as an unintentional archive of content that no longer exists on the web is something I had not considered before reading this paper. I also found the contextual integrity examples particularly disturbing. In one, the model combines two completely unrelated memorized fragments into a false narrative about a real person, attributing a 2013 murder to a victim of the 2016 Orlando shooting. This failure mode goes beyond simple data leakage into something closer to automated defamation, and it happens without any adversary intending it. The paper also raises the open question of whether fine-tuning a model on task-specific data causes it to forget pre-training memorization or to introduce new memorization from the fine-tuning data, a direction that remains unexplored and has significant implications for how deployed models should be audited.

The finding I found most confusing on first read was the relationship between overfitting and memorization. The paper argues that the absence of overfitting does not imply the absence of memorization, and the reasoning is sound once you understand that the train-test gap is an average measure while memorization is a worst-case phenomenon at the level of individual examples. But the paper takes several pages to make this distinction clear, and an early concrete example of a specific training document with anomalously low loss would have helped ground it before the formal definitions. The paper is also careful throughout to say that memorization correlates with certain conditions rather than claiming causation; it explicitly acknowledges that understanding why models memorize is an open question and that its results are observational. This epistemic honesty is appropriate, but I initially expected stronger causal claims given how confidently the attack itself is presented. Finally, I initially expected the baseline attack to at least partially surface sensitive content, since the authors present it as a working first attempt. In reality it finds nothing privately sensitive, only widely repeated public content like software licenses and common boilerplate. Calling this a weakness understates the case: the baseline completely fails at the actual goal of recovering private data, which makes the gap between the naive and improved approaches more significant than the paper initially signals.
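
The average-versus-worst-case point can be illustrated with a toy calculation. The loss values below are invented for illustration and do not come from the paper; the three-standard-deviation threshold and the function names are also assumptions.

```python
from statistics import mean, pstdev


def generalization_gap(train_losses: list[float], test_losses: list[float]) -> float:
    """Average test loss minus average train loss."""
    return mean(test_losses) - mean(train_losses)


def anomalously_low(losses: list[float], z: float = 3.0) -> list[int]:
    """Indices whose loss is more than `z` standard deviations below the mean."""
    mu, sigma = mean(losses), pstdev(losses)
    return [i for i, loss in enumerate(losses) if loss < mu - z * sigma]


# Invented per-example losses: 99 ordinary training examples and one outlier.
train_losses = [3.0] * 99 + [0.1]
test_losses = [3.0] * 100

print(generalization_gap(train_losses, test_losses))  # small average gap
print(anomalously_low(train_losses))                  # index of the outlier
```

The average gap stays close to zero because a single example barely moves the mean, yet that one example is exactly the kind a per-example check would flag.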
@@ -0,0 +1,30 @@
Title: SoK: Science, Security, and the Elusive Goal of Security as a Scientific Pursuit

Professor Name: Habeeb Olufowobi
Student Name: Harshwardhan Patil
Student ID: 1002224144
Date: 04.22.2026

Summary

The paper “SoK: Science, Security, and the Elusive Goal of Security as a Scientific Pursuit” presents a systematic effort to define cybersecurity as a rigorous scientific discipline rather than just a collection of ad hoc practices. The authors argue that while fields like physics and mathematics have well-established theoretical foundations, cybersecurity lacks universally accepted principles, formal models, and repeatable experimental methodologies. As a result, security solutions are often reactive, inconsistent, and difficult to validate. The paper emphasizes that security cannot be treated as a binary property (secure vs. insecure), but rather as a probabilistic and evolving state where systems are continuously exposed to new threats.

A central theme of the paper is the gap between formal security models and real-world systems. The authors highlight that many cryptographic and system-level security models fail to account for practical attack vectors such as side-channel attacks, implementation flaws, and human behavior. For example, even strong cryptographic systems can fail due to leakage or assumptions that do not hold in practice. The paper also discusses the difficulty of defining “correct” security properties, as different stakeholders may have conflicting interpretations of what security means. This leads to inconsistencies in evaluation and implementation.

Another important contribution of the paper is its focus on measurement and experimentation in security research. Unlike traditional sciences, where experiments can be replicated under controlled conditions, cybersecurity experiments are often difficult to reproduce due to evolving environments, hidden variables, and adversarial behavior. The paper argues for the development of standardized metrics and methodologies to improve the reliability and comparability of security research. It also stresses the importance of hypothesis-driven approaches, suggesting that security decisions should be treated as scientific hypotheses that can be tested and falsified.

Overall, the paper aims to push the field toward a more disciplined and structured approach, where security research is grounded in theory, validated through experiments, and continuously refined based on empirical evidence.

Discussion

One of the most compelling aspects of this paper is its honest critique of the current state of cybersecurity. The authors do not attempt to present security as a mature or fully understood field; instead, they acknowledge its limitations and highlight the challenges in establishing it as a true science. This level of transparency is valuable because it sets realistic expectations for both researchers and practitioners. The idea that “we can never be sure that we are secure, only that we are insecure” is particularly powerful, as it reframes security from a goal to a continuous process.

A major strength of the paper is its discussion of the disconnect between theoretical models and practical implementations. I found the examples from cryptography especially insightful, where formal proofs often fail to capture real-world attack vectors like side-channel attacks. This highlights a critical issue: even mathematically sound systems can be insecure when deployed. This was somewhat surprising because cryptography is often considered the “most rigorous” area of cybersecurity, yet even it suffers from fundamental limitations. The paper effectively demonstrates that relying solely on formal models can create a false sense of security.

However, one aspect I found less satisfying is the lack of concrete solutions. While the paper clearly identifies the problems in the field, it remains somewhat abstract when proposing remedies. For instance, it advocates for better metrics and experimental rigor but does not provide detailed frameworks or methodologies for achieving these goals. Including case studies or examples of successful scientific approaches in cybersecurity would have strengthened the paper significantly.

Another area that could have been improved is clarity in certain sections. The discussion around security definitions and models can become dense and difficult to follow, especially for readers who are not already familiar with the field. For example, the debate over “correct” definitions of security is important but could have benefited from simpler examples or visual representations. At times, the paper assumes a high level of prior knowledge, which may limit its accessibility to beginners.

One point that I found particularly thought-provoking is the emphasis on assumptions in security systems. The paper suggests that many vulnerabilities arise not from flaws in design but from incorrect or incomplete assumptions about the environment or adversary. This aligns closely with real-world incidents, where systems fail because they were not designed with the correct threat model in mind. It reinforces the idea that understanding the attacker is just as important as building defenses.

Overall, this paper provides a critical and reflective perspective on the field of cybersecurity. It challenges the reader to think beyond tools and techniques and consider the foundational principles that govern security. While it does not provide definitive answers, it successfully highlights the need for a more structured, scientific approach to security research. The key takeaway is that cybersecurity is still an evolving discipline, and developing it into a true science requires better models, better measurements, and a deeper understanding of real-world systems.