Automating Intelligence - Seeds of Superintelligence in Cybersecurity - A 12/21/20 Message to the FBI
Part 9 of Artificial General Intelligence (And Superintelligence) And How To Survive It
Before we continue, I should explain my 12/21/20 message to the FBI on how US Federal cyber could be secured by automating it - via AI.
I will explain… by sharing it in full.
Given the pace of AI advancement, and statements by the US, UK and NATO indicating they have automated their cyberdefenses through artificial intelligence…
We can safely assume the following information is no longer state of the art 2 1/2 years later.
The following should provide insights into how cyber could be automated even before large language models had proven themselves. This is far from the only area in which AI has been evident in the last 8 years. Evolutionary algorithms in psychological warfare have arguably overshadowed everything, and remain an ongoing threat.
But that is a discussion for another day.
For now, let’s consider how even a casual observer could generate tools to address dire threats, given sufficient motivation - not only the immediate issue of influence campaigns and psychological warfare exercises throughout social media and across the world…
But the looming, existential threat of artificial intelligence out of control, whether at its own direction or that of reckless or destructive human beings.
As a side note, the following is broken into three parts because the FBI website can only accept so much per message, and there are a few edits in brackets, like so - <a>.
This copy should also serve as a warning for any organization, particularly any of size or significance, which has not already upgraded its defenses to deal with the AIs already loose in the world.
Data Mining Widespread Government and Corporate Hacking
Part 1 of 3
To Whom It May Concern:
Given reports of widespread hacking of the US government and corporations associated with SolarWinds and possibly other issues, I am writing to discuss how generative adversarial networks (GANs) could be used not only to find these intrusions automatically and continuously, but also to search our networks for further intrusions and built-in flaws, and to unceasingly refine our ability to detect infiltration or system weaknesses.
I have written to the FBI on a number of subjects, and the hackers involved with this cyberattack appear to have taken safeguards to thwart a few of my previous suggestions for countering these activities.
Fortunately, there are more-advanced ways to defeat their efforts. These tools, once deployed, may actually prove easier to use as well, especially on a global scale.
This particular set of intrusions is notorious for the degree to which its perpetrators have been trying to cover their tracks. However, the government has uncovered enough instances to serve as a vast and ideal set of training data for any GANs, which will enable us to look beyond past red flags and known methods of evasion.
On roughly 9/30/18, 12/30/18 and 1/2/19, I covered using web-crawler bots and passive email collection to find conventional global malware, using convergences and repeating patterns in data tags and DNS data trails to unearth further activity, and how all of this might converge to reveal online illegal influence operations, a subject discussed further on 1/2/20.
On 1/3/20 I covered ways to accelerate and automate the analysis of malware and botnets.
As I wrote to you on 10/23/19:
“Generative adversarial networks are a method of machine learning commonly applied to problems such as enhancing images and detecting deep fakes. Two neural networks challenge each other in a game, each trying to “outwit” the other. They are given a training set as an example, and subsequently learn to generate new data sets meeting the same statistical parameters as the training set.
“Essentially, a generative network produces candidates and a discriminative network evaluates them. By competing against each other, both neural networks hone their skills and become increasingly capable.”
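As a toy illustration of that adversarial dynamic (not a real GAN, which would pit two neural networks against each other; every number and name here is invented for the sketch), consider a one-parameter "generator" chasing a discriminator's evolving model of legitimate data:

```python
import random

random.seed(0)

REAL_MEAN = 7.0  # the "statistical parameters" of the training set

def real_sample():
    # Legitimate training data: values clustered around a true mean.
    return REAL_MEAN + random.uniform(-0.5, 0.5)

def discriminator(x, estimate):
    # Scores how "real" a candidate looks: closer to the learned
    # estimate of the genuine distribution scores higher.
    return -abs(x - estimate)

g = 0.0         # the generator's lone parameter, starting far from real
estimate = 0.0  # the discriminator's running model of real data

for step in range(2000):
    # The discriminator trains on a fresh real sample (running average).
    estimate += 0.05 * (real_sample() - estimate)
    # The generator nudges its output whichever way fools the critic more.
    if discriminator(g + 0.1, estimate) > discriminator(g, estimate):
        g += 0.1
    elif discriminator(g - 0.1, estimate) > discriminator(g, estimate):
        g -= 0.1

print(round(g, 1))  # the generator ends up near the real mean
```

The point of the sketch is only the feedback loop: each side's improvement forces the other to improve, which is what makes the technique suited to unearthing indicators no one thought to hand-code.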
GANs could be used to sift code and activities on an immense scale, looking for anomalies and intrusions such as changes to code, the sources of those changes, where data goes that it should not, and where commands come from that they should not. Changes to code obviously represent less data than the entire system, but for our purposes both benign and malicious code are useful data sets for training our GANs.
We will be including not only all known markers signifying intrusion, but all data points involved with each instance. The goal is not merely to detect known red flags but to let the GAN find others, including those showing up in subtle changes or multiple differences emerging simultaneously.
Training GANs to address this problem has some obvious advantages. The Federal government owns these networks and the data. They have at least several months of data on these intrusions, identified and otherwise, to work with. Anything uncovered already reveals not only whatever markers exposed it, but all data points involved in each intrusion exposed, and each change made. This is not just about finding those hints in past activities, though obviously directly automating searches for those red flags has immediate value. Rather, it is about sifting for all possible indicators, even those a human mind would have difficulty uncovering, even given considerable time, and to keep automatically searching and refining that search, internally.
Another focus should be the supply chain itself – not just commercial software, programming by contractors and in-house solutions, but also “shareware” such as API modules containing built-in flaws or simply bad programming. GANs could be built to analyze these for weaknesses in a variety of circumstances. Each GAN might focus on a particular problem set – software updates with intentional or unknowing flaws, modules with inherent security gaps, weaknesses in specific hardware, vulnerabilities of weapons or other hackable equipment and even assessing prospective new programs as a whole.
First, we need very large datasets ideally suited for training our neural networks, which, for our initial problem, will be the code and data already owned by the Federal government, which cybersecurity experts have already been poring over. Corporate networks will benefit from this effort, but Federal code and data is where we can work unrestricted, so that is where we will develop the GANs.
Though of course Federal cybersecurity officials will have a much more exhaustive list of factors a GAN could analyze, for the purposes of this message, CISA’s guidance contains enough details to get us started.
https://us-cert.cisa.gov/ncas/alerts/aa20-352a
I would quote all of these, but there are clearly a host of relevant details, with which the agencies are already intimately familiar.
One glaring example is the reputed use of steganography to transmit confidential data. This is an ideal problem which could be broken out from everything else and targeted with a GAN looking only for steganography, which could then act independently or in support of other convolutional neural networks (CNNs) such as its fellow GANs. In the case of steganography, there are doubtless plentiful data sets, including legitimate images, images with steganography created with various tools and levels of skill, and plenty of legitimate data traffic along with any steganography examples unearthed by FireEye or other researchers. Particularly to the degree this data is actually concealed in normal images, it is really a classic problem for a GAN. The GAN focused on this issue can be trained by long-standing methods, and put at the disposal of cybersecurity experts and other GANs combing the databases.
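A minimal sketch of why embedded data is statistically detectable at all: classic least-significant-bit (LSB) embedding tends to flatten whatever bias the cover data's low bits carried, and that flattening is exactly the kind of telltale a trained detector can seize on. The cover data, its 70/30 bit bias, and the thresholds are all invented for this example; real covers are subtler and real detectors use far richer statistics.

```python
import random

random.seed(1)

def cover_bytes(n):
    # Toy "cover" whose low bits are biased toward 0, standing in for
    # the statistical structure of genuine images (an assumption made
    # purely so the effect is visible in a few lines).
    return [(random.randrange(256) & ~1) | (1 if random.random() < 0.3 else 0)
            for _ in range(n)]

def embed_lsb(data, bits):
    # Classic LSB steganography: overwrite each byte's lowest bit
    # with one bit of the hidden message.
    return [(b & ~1) | bit for b, bit in zip(data, bits)]

def lsb_bias(data):
    # Distance of the low-bit distribution from 50/50. Embedding an
    # encrypted or compressed payload drives this toward zero.
    ones = sum(b & 1 for b in data)
    return abs(ones / len(data) - 0.5)

cover = cover_bytes(20000)
message = [random.getrandbits(1) for _ in range(20000)]
stego = embed_lsb(cover, message)

# The erased bias is the sort of telltale a discriminator can learn.
print(round(lsb_bias(cover), 2), round(lsb_bias(stego), 2))
```

A GAN attacks the same problem from the other direction: rather than hand-picking a statistic like this one, the discriminator learns whatever statistics separate clean traffic from stego-bearing traffic in the training data.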
Another point of interest is seeing the malware avoiding malware analysis sandboxes.
To quote:
“According to FireEye, the malware also checks for a list of hard-coded IPv4 and IPv6 addresses—including RFC-reserved IPv4 and IPv6 IP—in an attempt to detect if the malware is executed in an analysis environment (e.g., a malware analysis sandbox); if so, the malware will stop further execution.
Additionally, FireEye analysis identified that the backdoor implemented time threshold checks to ensure that there are unpredictable delays between C2 communication attempts, further frustrating traditional network-based analysis.”
To a degree, we are trying to turn every Federal network into a malware analysis sandbox, both in real time and in retrospect. Presumably, though, a faux target with no critical data – or nothing which has not already been compromised or rewritten – could be used as a makeshift sandbox. An isolated computer cluster, for example. But again, the point is really <to> hone your GANs to the point they can detect and thwart intrusions in real time, and more importantly find and seal off all vulnerabilities and track all intrusions back to their sources.
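The hard-coded address check quoted above can also be turned around by defenders. A sketch using Python's standard ipaddress module; the logic and addresses are illustrative, not recovered from the actual malware:

```python
import ipaddress

def looks_like_analysis_environment(resolved_addresses):
    # Mirrors, in spirit, the evasion check described above: treat
    # any resolution into RFC-reserved space as a sign of a sandbox.
    for addr in resolved_addresses:
        ip = ipaddress.ip_address(addr)
        if ip.is_private or ip.is_reserved or ip.is_loopback or ip.is_link_local:
            return True
    return False

# A defender can flip the logic around: a binary that branches on
# reserved-address lookups is itself a red flag worth training on.
print(looks_like_analysis_environment(["10.0.0.5"]))       # True  (RFC 1918)
print(looks_like_analysis_environment(["93.184.216.34"]))  # False (public)
```

That inversion is the general pattern here: every evasion technique the adversary builds in is one more feature a discriminator can learn to spot.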
To add three more quotes:
“Analyze stored network traffic for indications of compromise, including new external DNS domains to which a small number of agency hosts (e.g., SolarWinds systems) have had connections.”
“This binary, once installed, calls out to a victim-specific avsvmcloud[.]com domain using a protocol designed to mimic legitimate SolarWinds protocol traffic. After the initial check-in, the adversary can use the Domain Name System (DNS) response to selectively send back new domains or IP addresses for interactive command and control (C2) traffic. Consequently, entities that observe traffic from their SolarWinds Orion devices to avsvmcloud[.]com should not immediately conclude that the adversary leveraged the SolarWinds Orion backdoor. Instead, additional investigation is needed into whether the SolarWinds Orion device engaged in further unexplained communications. If additional Canonical Name record (CNAME) resolutions associated with the avsvmcloud[.]com domain are observed, possible additional adversary action leveraging the back door has occurred.”
“The adversary is making extensive use of obfuscation to hide their C2 communications. The adversary is using virtual private servers (VPSs), often with IP addresses in the home country of the victim, for most communications to hide their activity among legitimate user traffic. The attackers also frequently rotate their “last mile” IP addresses to different endpoints to obscure their activity and avoid detection.
“FireEye has reported that the adversary is using steganography (Obfuscated Files or Information:
Steganography [T1027.003]) to obscure C2 communications.[3] This technique negates many common defensive capabilities in detecting the activity. Note: CISA has not yet been able to independently confirm the adversary’s use of this technique.
“While not a full anti-forensic technique, the adversary is heavily leveraging compromised or spoofed tokens for accounts for lateral movement. This will frustrate commonly used detection techniques in many environments. Since valid, but unauthorized, security tokens and accounts are utilized, detecting this activity will require the maturity to identify actions that are outside of a user’s normal duties. For example, it is unlikely that an account associated with the HR department would need to access the cyber threat intelligence database.
“Taken together, these observed techniques indicate an adversary who is skilled, stealthy with operational security, and is willing to expend significant resources to maintain covert presence.”
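The HR example in that last excerpt amounts to baselining each account's normal duties and flagging departures, even when the token itself is valid. A minimal sketch; the accounts, resource names and history are all made up:

```python
from collections import defaultdict

# Historical access logs used to baseline each account's normal
# duties. Every name below is invented for the example.
history = [
    ("hr_alice", "payroll_db"),
    ("hr_alice", "benefits_portal"),
    ("analyst_bob", "threat_intel_db"),
    ("analyst_bob", "malware_archive"),
    ("hr_alice", "payroll_db"),
]

baseline = defaultdict(set)
for account, resource in history:
    baseline[account].add(resource)

def out_of_role(account, resource):
    # The token is valid; the question is whether the action fits
    # the account's established pattern of duties.
    return resource not in baseline[account]

print(out_of_role("hr_alice", "threat_intel_db"))    # True: investigate
print(out_of_role("analyst_bob", "threat_intel_db")) # False: routine
```

A production system would of course need far richer baselines (time of day, volume, peer groups) to keep false positives manageable, which is precisely the sort of nuance a trained network can absorb.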
While the above details may complicate the situation, the very fact that we know them indicates we have relevant data regarding these activities, which may yield additional weaknesses and means of tracking them as we analyze their work further. In particular, GANs specialized in detecting falsified DNS addresses and other known means of evasion may find other cues for tracking this activity, not just inside the networks, but back through the Internet. For example, if we know what servers they were coming in through, is it possible that somewhere in that chain of computers there are logfiles which remain unaltered, or which show forensic signs of tampering which might provide relevant clues?
Remember, we should have not only the instances of surreptitious activity discovered in the last several months of records, but a vast amount of legitimate activity, from during and before the hacks. And key to this work is the degree to which GANs can be honed for a particular task and then carry it out with a speed and precision teams of human beings with conventional tools would be hard pressed to match.
We can and will use human expertise. But there is no point to handcrafting algorithms or having individuals personally sift every line of code – a task so vast it would defeat any search, no matter how ambitious.
Data Mining Widespread Government and Corporate Hacking
Part 2 of 3
But instead of exhausting our people looking for specific indicators and handcrafting programs to search for each hint of malicious activity in turn, we may be able to automate the entire solution and turn the scale of the problem from an insurmountable barrier into a harvest of valuable evidence.
Similarly, API modules can not only be put in a sandbox and tested, but GANs can be refined by vying to create the most undetectable flaws versus finding the most subtle and invisible of weaknesses. Once we have such tools, there is arguably an imperative for sifting prevalent Internet shareware and commercial software for zero-day exploits and other issues.
The rest of this message includes excerpts from those previous messages which may prove useful in this endeavor.
From my 1/3/20 message on methods to accelerate and automate the analysis of malware and botnets:
-----
The following message addresses an imminent threat and opportunity in cybersecurity – malware and security programs evolving continuously at hyperspeed through a combination of evolutionary algorithms, enhanced processing power and convolutional neural networks (CNNs) such as generative adversarial networks (GANs). In this we will be examining both conventional malware and techniques for reprogramming neural networks simply by altering the data they train on.
Even the convolutional neural networks of GANs can be dramatically enhanced or thwarted, based upon the training sets they have access to and the hardware they are running on, or the degree to which mere inputs of data, as opposed to actual malware, can hack them. One advantage we can provide ours is to give them data their real-world adversaries are unaware of.
These unique vulnerabilities, particularly with regard to being hacked by data inputs, are another reason to test neural networks furiously, especially ones with any significant responsibilities (or processing resources) in any way exposed to potential public manipulation.
This is How You Hack A Neural Network
Adversarial Reprogramming of Neural Networks
https://arxiv.org/abs/1806.11146
“Deep neural networks are susceptible to adversarial attacks. In computer vision, well-crafted perturbations to images can cause neural networks to make mistakes such as confusing a cat with a computer. Previous adversarial attacks have been designed to degrade performance of models or cause machine learning models to produce specific outputs chosen ahead of time by the attacker. We introduce attacks that instead reprogram the target model to perform a task chosen by the attacker – without the attacker needing to specify or compute the desired output for each test-time input. This attack finds a single adversarial perturbation, that can be added to all test-time inputs to a machine learning model in order to cause the model to perform a task chosen by the adversary – even if the model was not trained to do this task. These perturbations can thus be considered a program for the new task. We demonstrate adversarial reprogramming on six ImageNet classification models, repurposing these models to perform a counting task, as well as classification tasks: classification of MNIST and CIFAR-10 examples presented as inputs to the ImageNet model.”
For obvious reasons, being able to anticipate adversarial attacks on neural networks, especially those required to take in data from non-curated sources, is of the utmost importance, as is being able to thwart these assaults.
To do this, take a sample of the network programming and run it at accelerated speed on effectively faster processors by sourcing it to a supercomputer or a more advanced and more tightly integrated network (combining faster processing with reduced latency by putting all the networked computers on one site, ideally right on top of each other). You can draw these network samples from botnets, from backup or mirror servers storing instances of the program, or by simply seizing them either online or in the real world.
Use evolutionary algorithms to test stimuli to reset or reprogram the network. Then use in the wild or replace components with modified hardware carrying “updated” software to infect the original hostile network. Data can be updated in real time to regulate/drive the intended effect.
Similarly run high-speed evolutionary algorithm tests of malware under controlled, isolated conditions to superharden defenses against all malware threats and to enhance detection and mitigation. A simple means of doing so would be to take a multitude of existing malware programs and use them in an air gapped system running various firewalls and other conventional and non-conventional defenses. Break up these malware programs into assorted pieces, recombine them and deploy them against those defenses, check the results and combine the most successful and test again.
Simultaneously, look at the key elements of programs hunting these systems, break them down into their key components and again deploy, test and recombine in an endless cycle, looking for the most effective results.
Then run both of these experiments at high speeds in what will be essentially a GAN, or a pair of them (or several pairs).
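The break-up, recombine, test and repeat cycle described above is a standard evolutionary algorithm. A harmless toy version, evolving bitstrings against a fixed fitness function that merely stands in for "success against the sandboxed defenses" (no actual malware or defenses involved; the target pattern and parameters are invented):

```python
import random

random.seed(42)

# Stand-in for "the combination that best evades every defense."
# In the real exercise, fitness would come from running each
# recombined sample against the isolated, sandboxed defenses.
TARGET = [1, 0, 1, 1, 0, 0, 1, 0] * 4
LENGTH = len(TARGET)

def fitness(genome):
    return sum(g == t for g, t in zip(genome, TARGET))

def crossover(a, b):
    # Recombine two samples at a random cut point.
    cut = random.randrange(1, LENGTH)
    return a[:cut] + b[cut:]

def mutate(genome, rate=0.02):
    return [1 - g if random.random() < rate else g for g in genome]

# Start from random "pieces" and iterate: keep the most successful
# half, recombine and mutate them, and test again.
population = [[random.getrandbits(1) for _ in range(LENGTH)]
              for _ in range(40)]
start_best = max(fitness(g) for g in population)

for generation in range(60):
    population.sort(key=fitness, reverse=True)
    survivors = population[:20]
    children = [mutate(crossover(random.choice(survivors),
                                 random.choice(survivors)))
                for _ in range(20)]
    population = survivors + children

end_best = max(fitness(g) for g in population)
print(start_best, "->", end_best)  # fitness climbs generation by generation
```

Pairing two such loops against each other, one evolving attacks and one evolving defenses, yields the GAN-like arrangement described next.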
As I wrote to you on 10/23/19:
“Generative adversarial networks are a method of machine learning commonly applied to problems such as enhancing images and detecting deep fakes. Two neural networks challenge each other in a game, each trying to “outwit” the other. They are given a training set as an example, and subsequently learn to generate new data sets meeting the same statistical parameters as the training set.
“Essentially, a generative network produces candidates and a discriminative network evaluates them. By competing against each other, both neural networks hone their skills and become increasingly capable.”
We might have to break out specific goals and work more limited sets of tested malware initially. But eventually we could make use of the vast library of programs and exploits that is the Internet – a space which is also a vast, real-time testing ground for many more – to outstrip the ordinary pace of development.
<Redacted>
A particularly skillful means of deploying any exceptional benefits from this practice – particularly beyond the most secure government systems under its protection – would be to find ways to stimulate more conventional defenses, much as vaccines augment the body's natural immune system by giving it an inert version of a virus, to which it reacts and against whose specific code the body is thus strengthened. Integrated with public, allied and commercial cyber-defenses, these resources can work subtly to upgrade defenses without showing the true extent of their resources.
As I noted in my 1/1/20 message to the FBI:
“We, again, may be able to mobilize computer clusters isolated from the larger Internet to address embarrassingly parallel problems separately and feed the answers back to the CNN or GAN, and eventually we should be able to incorporate quantum search into this work.
“Another point to remember in keeping this modular and using multiple GANs and evidence sources is that every unanticipated and/or radically more powerful technology or dataset adds that much more to our capacity and makes all of this crowdsourced and other sensory data that much harder to counter.
Hence, exascale and zettascale computation, quantum search, massively parallelized analysis of embarrassingly parallel problems, evolutionary algorithms and certain sensors and evidence sources discussed either here or elsewhere are all resources we will want to incorporate where possible.”
As I once put it in a speculative fiction context “…a single "ping" alerting an invaded system of an ongoing attack, a sudden shutdown of a key communications hub in mid-incursion or basilisk hack, an involuntarily inserted or gift software patch that renders an obvious technique completely useless against the individual or organization.
“One factor often seen in these kinds of conflicts is an unspoken, ongoing assessment made by all of the relatively sane participants. Am I exposing too much of my resources, technology, tools and/or identity in this matter, and if so, is it worth it?”
I discussed a very simple, cursory way of managing a combined human and automated response to rapidly emerging, non-conventional threats in 2016:
Automating Everything - Cyber-Defense and Countering Pandemics – Managing Impossible Threats
http://futureimperative.blogspot.com/2016/06/automating-everything-cyber-defense-and.html
“Further, the basic method of monitoring multiple semi-autonomous artificial agents can be applied to other circumstances. For example, evolutionary algorithms may one day give us the ability to have a host of agents operating in defense of a computer network – perhaps even a national, multi-national or effectively global network. A sufficiently advanced artificial intelligence or a team of human security experts or some combination thereof might maintain oversight and focus resources automatically when the normal, lower-level agents seemed challenged or outmatched. The triggers for this intervention would likely be numerous, and balanced by the need to avoid overreacting or overcommitting resources. But events such as an indication of clear data breaches in a sub-network, or encryption requiring intense supercomputing or quantum-computing analysis, or even a tricky political judgement call (such as repeated attacks seemingly sourced from the computers of a hostile nation or private organization) may require more advanced thinking or vastly greater processing power than might otherwise be available.
“Similarly, an AI and/or human team attempting to deal with a nanotech attack involving a multitude of differing and rapidly changing molecular machines might have to allow a degree of automatic response occur on the local level while gathering information, assessing successful and unsuccessful tactics and sourcing resources as appropriate. A bioweapons attack using a multitude of natural and/or artificial plagues might require a similar capacity to respond at both a conscious and unconscious level.
“The basic system would effectively be multi-layered. The simplest and most widespread elements of each system will collect information and begin any reflexive responses they have automatically – whether they are digital medical instruments, spectroscopic air readings, online objects in the Internet of Things, anti-virus programs running on individual PCs, tablets, smartphones and microcomputers, independent software security agents, or nanites or natural or artificial biological elements of a human or civilizational immune system.
Data Mining Widespread Government and Corporate Hacking
Part 3 of 3
“Hence, antivirus programs looped into this system would engage their usual resources, but also alert another node about attacks that were unusual in their frequency or nature, and pass on what was observed diagnostically as well as the real or apparent source of the attacks. The node being contacted would collect information either to be passed on further or analyzed there. Once analyzed, the software would determine if there were a source – or a highly compromised network or set of networks – which could be cut off in response to the issue, or whose operators could be alerted to their vulnerable state.
That analysis would also help determine whether experts should be proactively notified of the issue. As the technology advanced, running genetic algorithms to see how existing security software could be immunized against a virus and its immediate variations would also be an option. The power to perform critical actions, such as contacting a hostile organization being used unknowingly as the host for attacks, determining the source of the attacks or actively going after that source, would be left in the hands of the highest-level decision makers in the system.
“Alternatively, a doctor examines a patient with a very bad case of the flu, and the strain is automatically analyzed and its DNA transmitted securely for at least partial sequencing. A cursory examination of the strain determines whether it is a normal strain of the flu, a more dangerous variant, or something altogether different from a known normal disease to <a> newly discovered natural virus to a bioweapon. Anything flagged as dangerous triggers a notification, but also begins whatever responses can be automated in terms of assessing the risks, geolocating incidents of infection and its vectors, developing a vaccine in a secure location and notifying all networked sensors and medical personnel to be aware of this specific threat. If information came about additional instances involving different diseases, for example in the case of a rapidly mutating virus, multiple diseases being released intentionally and/or artificial bioweapons, this information could be gathered and cross-referenced even as the work to deal with the existing health issues continued in the field. Dealing with nano-terrorism could be similar, though the first signs could come from security systems that carefully analyze and filter air noting unusual materials (or unusually structured materials) showing up in their continuous spectroscopic analysis of the solids, liquids and gases filtered out or other high-end security options.
Alternatively, as sensors and immense processing power become more ubiquitous, information collected for medical or scientific reasons may note such an intrusion, especially if the raw data (particularly data collected at a government’s behest, or with their primary funding) is used to help assess potential catastrophic threats (such as bio or nano-terrorism).
“If dealing with such a problem, the creation of countering agents or even immunizing species – viruses which have no effect other than triggering natural immune systems against dangerous plagues, or defensive nanites built to eliminate invasive ones – could occur locally under the direction of a central source or, especially in the case of furiously changing bio- or nano-threats, could transpire semiautonomously, with responses occurring within established parameters and data on the steps taken being transmitted to the oversight centers which would be given veto power on extreme measures (fire to purge contaminated buildings, releasing potentially uncontrollable, self-sustaining nanites into the wild) and which could intervene as needed, but which would otherwise allow each local actor to respond to the best of their ability, albeit while fully informed of the best practices as yet uncovered.”
The above technique, especially in concert with other powerful tools, should prove useful in staying well ahead of malevolent actors in cyberspace, and of course I have already sent you further information on tracking and countering non-conventional threats such as weapons of mass destruction on 1/1/20.
-----
From the 12/30/18 message:
-----
At the end of this September I sent a few suggestions regarding how to track malware globally.
After some consideration, I believe there is one key point I should add to expand on that concept, one which should dramatically enhance its value: key information tags which could potentially be searched for on an immense scale in databases the government has access to.
My original message involved using web-crawler bots and passive email addresses and other identities to either actively trigger malware in the public domain or to receive it automatically, thereby enabling investigations as these emerged.
I am assuming any such investigation would involve the DNS databases, server logfiles, and any information stored on botnets engaged in these activities. But it occurs to me there are specific pieces of information which would be very telling, and I wanted to point them out.
If we find out about a large-scale malware operation of any kind, then tracking the stolen data and transmitted commands back to the source is obviously critical. Whether the information goes back directly or is bounced around other computers such as a botnet, there should be key URLs or other waypoints involved that will be revealing.
Clearly, if you know a particular computer is receiving stolen data, you will investigate other information showing the hallmarks of stolen information going to it. But what if specific relays, cutouts, hacked servers or full-scale botnets were involved?
Obviously, criminal and hostile-intelligence botnets should be thoroughly investigated for any activities they undertake. But if other specific chokepoints are facilitating malware transmissions, we may be able to find more valuable information.
First, look for places where many connections converge, such as a single computer, a network under the control of an intelligence or organized-crime operation, servers meant to store hacked files or retransmit them in a secure or altered format, or just botnets.
Obviously, we want to examine these. But what if key DNS URLs or partial tags showing up in this malware could flag related operations and allow us to take our existing database of malefactors and find others?
The convergence of other malware passing through the same locations is obvious. But what if specific tags show up again and again in whole or especially in part? Strings of digits long enough that their appearance is extremely unlikely to be a coincidence, and which will of course be double checked by humans, but which can be searched for by automated systems?
What if a particular malware programmer used not only certain computers repeatedly, but tended to repeat the use of particular strings of digits in their filepaths, URLs, public phishing copy and so forth?
Once we have these strings, we can search DNS databases for similar metadata cropping up, as well as other databases involving other revealing cues, such as email addresses, email subject lines, phishing ad copy and cryptocurrency public and private keys. Anything revealing which can be searched for automatically would be helpful.
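The repeated-string idea can be sketched directly: pull long digit runs out of artifacts from nominally separate incidents and flag any that recur across several of them. Every sample string below is invented for the example:

```python
import re
from collections import Counter

# Artifacts recovered from what are nominally separate incidents.
samples = [
    "GET /upd/7736251190281/pkg.bin HTTP/1.1",
    "callback=cdn-7736251190281.example-relay.net",
    "wallet ref=9914220076 posted to drop",
    "filepath C:\\tmp\\7736251190281\\stage2.dll",
]

MIN_LEN = 10  # long enough that a repeat is unlikely to be chance

def digit_strings(text):
    return re.findall(r"\d{%d,}" % MIN_LEN, text)

counts = Counter()
for sample in samples:
    # Count each string once per artifact, so repetition inside a
    # single sample does not inflate the cross-incident signal.
    for s in set(digit_strings(sample)):
        counts[s] += 1

recurring = sorted(s for s, c in counts.items() if c >= 3)
print(recurring)  # ['7736251190281'] -- flagged for human review
```

The same pattern extends beyond digit strings to email subject lines, filepaths, phishing copy and cryptocurrency keys: anything automatable to extract and compare, with humans double-checking whatever converges.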
One way is to search for the malware’s tags and data flags. Another is to automatically sift what is going to known destinations, chokepoints and botnets for any of these revealing data points.
If we can turn these databases into de facto listings of historic malware distributors and enablers, we may have probable cause and even extensive digital evidence on a host of crimes. As more malware is tracked, more computers are seized and more of these data strings are found, we will be able to continue adding to this archive of evidence and find an ever-growing multitude of perpetrators and their victims.
Obviously, human consideration and oversight are necessary. But any particularly massive convergence of these threads of data may indicate active operations, deserving of your attention.
Thank you, as always, for your work.
I should have something else very useful for you in the near future, which will hopefully prove far more powerful than any of the tools I have suggested thus far.
I am including part of my September, 2018 message below, for reference.
-----
In roughly mid-September, 2017, I offered some suggestions to the FBI on how cryptocurrency, botnets and other online instruments of the Darknet and the criminal underworld could be used to track crime and espionage on an unprecedented scale.
I am writing now because I have another powerful potential tool to offer law enforcement.
Hacking, including malware, has been increasingly used by transnational organized crime, terrorists and hostile nations to undermine the rule of law and democratic institutions.
Because of the complexities in tracking these activities and securing the necessary jurisdiction and authorization for warrants, intelligence intercepts, unmasking and so forth, this work becomes even more difficult for the individuals and institutions tasked with protecting us. The public-domain nature of cryptocurrency was of course key to my original message in 2017.
I have something similar, designed to ferret out even more data regarding hostile actors, again making use of public-domain information that likely requires no warrants at all, yet enables a vast and incredibly thorough scan of most of the heavily trafficked regions of the Internet.
The use of malware in emails and on websites is no secret, and it gives botnets and hackers access to a host of computers. What if we had a web-crawler bot that does nothing but check web pages like a search-engine crawler, but in this case clicks available buttons and links as it goes and sees what malware is triggered? The system behind it could then assess that malware and any organizations hosting it (perhaps deliberately), as most accessible pages taking significant traffic are by definition public domain and presumably require no warrant.
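The crawling half of that idea can be sketched with the standard library alone. This only harvests the clickable targets from a fetched page; actually following each one and watching for triggered malware would happen in an instrumented sandbox (a headless browser or similar), which is out of scope for a sketch.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkHarvester(HTMLParser):
    """Collect every clickable target on a page, the way the proposed
    crawler would, so each can then be followed in a sandbox and
    observed for triggered malware."""

    def __init__(self, base_url):
        super().__init__()
        self.base = base_url
        self.targets = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'a' and 'href' in attrs:
            self.targets.append(urljoin(self.base, attrs['href']))
        elif tag == 'form' and 'action' in attrs:
            self.targets.append(urljoin(self.base, attrs['action']))

def harvest(base_url, html):
    """Return absolute URLs for every link and form target in `html`."""
    parser = LinkHarvester(base_url)
    parser.feed(html)
    return parser.targets
```

A breadth-first loop over `harvest` output, with per-domain rate limits and a visited set, would give the search-engine-style crawl the message describes.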
Another variant of this, which could <be> employed simultaneously and even more cheaply, would be to seed a multitude of legitimate but little-used email accounts (possibly with actual names on them, depending on how discriminating malware systems become) for the primary purpose of sitting in contact lists until malware programs send phishing emails to everyone found on them. Not every program would trip this alarm, but once deployed and occasionally refreshed it would be a mainly passive system. If deployed broadly enough, it would also serve as a continuous public-domain watch for would-be threats.
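The detection side of those canary accounts reduces to a set intersection. All names and addresses below are hypothetical; the point is only that any mail reaching a seeded account proves the contact list it sat in was harvested.

```python
def check_canaries(message_recipients, sender, canary_addresses):
    """Flag a sender whose mail hits a canary address.

    canary_addresses: dormant accounts seeded into contact lists.
    They never correspond with anyone, so mail addressed to one means
    the contact list was harvested, almost certainly by malware, and
    the sender is flagged for review. Returns None for clean mail.
    """
    tripped = canary_addresses & set(message_recipients)
    if tripped:
        return {'sender': sender, 'tripped': sorted(tripped)}
    return None
```

This matches the mainly-passive character of the proposal: the system does nothing until a harvested list is actually used.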
The web-crawler could be run by the FBI or by an allied or similarly aligned non-profit if some separation were required for legal reasons. Intelligence organizations could similarly sift the Internet and parse the data uncovered.
We could cross-reference this information with what is known about the ownership of those websites, known botnets or rampant malware which might have inserted itself into the systems, information held by the Mueller investigation and other immense sources of evidence.
Examples include the de facto cryptocurrency map of criminal and espionage activities linked to easily traced cryptocurrency transfers, archived data and evidence from botnets and the offshore leaks database on money-laundering LLCs.
The FBI would be in a position to relentlessly track and thwart malware, but you can do so much more.
You can <trace> sources, vectors, strategies, tactics, ongoing operations, participants and intentions.
Let us consider using evolutionary algorithms to swap out inputs and hack neural networks, as you would populations in PSYOPS. Remember that disruption is easiest, but you also have the option of creating blind spots, enabling intrusions and data mining, or re-tasking a neural network to do the work you assign it. Hidden triggers activated by specific stimuli enable the appearance of a fully functional network operating normally, while giving you the ability to disrupt, blind, shut down or take over that network at will when tactically or strategically advantageous to do so.
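The input-swapping loop described above can be illustrated against a toy stand-in for a target network. Everything here is invented for illustration: a real attack would query the actual model, and the fitness function, bounds and hyperparameters would need tuning.

```python
import random

def toy_classifier(x):
    """Stand-in for a target network: a positive score means 'benign'."""
    weights = [0.9, -0.4, 0.7, 0.2]
    return sum(w * v for w, v in zip(weights, x))

def evolve_adversarial(seed, population=30, generations=200, step=0.1):
    """Evolve small perturbations of `seed` until the score flips.

    Fitness is the classifier score (lower is fitter); the fittest
    half survives each generation and spawns mutated copies, with
    every coordinate clamped to stay within 1.0 of the seed so the
    evolved input remains a near-neighbor of the original.
    """
    rng = random.Random(0)  # fixed seed for reproducibility
    pop = [list(seed) for _ in range(population)]
    for _ in range(generations):
        pop.sort(key=toy_classifier)            # lowest score first
        survivors = pop[:population // 2]
        children = []
        for _ in range(population - len(survivors)):
            parent = rng.choice(survivors)
            child = [min(max(v + rng.uniform(-step, step), s - 1.0), s + 1.0)
                     for v, s in zip(parent, seed)]
            children.append(child)
        pop = survivors + children
        if toy_classifier(survivors[0]) < 0:    # classification flipped
            break
    return min(pop, key=toy_classifier)
```

The same population-and-selection loop, pointed at a deployed model instead of a toy scorer, is what makes evolutionary search a black-box attack: it needs only the model's outputs, not its internals.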