A digital store comprising a method to store
digital data in live micro-organisms, and a method to selectively retrieve subsets of the stored data, is disclosed.
Digital data is represented as a plurality of key-value pairs. The proposed
system stores copies of key-value pairs in a plurality of live micro-organisms. Upon presentation of a retrieval key, the proposed digital store retrieves the value associated with the retrieval key. The storage method for a key-value pair comprises (a) mapping the key to a
gene that expresses a unique
fluorescent protein so that no two keys map to the same
gene, (b) encoding the key-value pair as base-pair sequences, (c) synthesizing
oligonucleotide chains from base-pairs for the key-value pair and the
gene, (d) synthesizing
recombinant DNA plasmids that have
oligonucleotide chains for the key-value pair, the gene, and two primers, as foreign
DNA inserts, (e) incorporating the
recombinant DNA plasmids into live micro-organisms, (f) isolating the live micro-organisms that have absorbed the
recombinant DNA plasmids, and (g)
safely storing the
population of live micro-organisms with embedded key-value pairs in a common
pool. Retrieval of the value paired with a key comprises (a) taking the retrieval key as input and mapping it to the specific gene for
fluorescent protein, (b) taking a sample from the
safe storage pool that contains live micro-organisms embedded with key-value pairs, (c) isolating the live micro-organisms that have expressed the gene by using high-speed
fluorescence-activated
cell sorting or
flow cytometry, (d) extracting
DNA from the recombinant
DNA plasmid in the isolated live micro-organisms, (e) selectively amplifying and sequencing only those DNA strands that contain the value for the key, and (f) decoding the base-pair sequence obtained after
DNA sequencing to yield the value associated with the retrieval key. We also disclose two important variations. The
first variation relates to the storage step. The recombinant DNA
plasmid is constructed to include additional non-fluorescent oligonucleotides and genes so that during the
data retrieval step, the live micro-organisms that have absorbed the
plasmid can be sorted by
cell sorters based on parameters of individual cells such as
cell size, cell complexity,
cell phenotype,
cell structure,
cell function, and magnetic or electrical properties. The second variation relates to both storage and retrieval of key-value pairs with large values. To store such a key-value pair, the large value is split into smaller blocks so that a block can fit into a recombinant DNA plasmid, and a distinct pair of primers is used for each block. A block's primer pair is used to selectively amplify and sequence only the DNA that encodes the data in the block, thereby enabling the retrieval of a specific block of the value, as opposed to retrieving the entire value associated with a key.
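
The data flow described above can be illustrated in software. The following is a minimal sketch only, and it models none of the wet-lab steps. It assumes a 2-bit-per-base encoding (00 to A, 01 to C, 10 to G, 11 to T), which the disclosure does not specify. The class name `Store`, the gene labels, and the primer labels such as `fwd-0` and `rev-0` are hypothetical placeholders, not real constructs or sequences. Filtering the pool by gene stands in for cell sorting, and filtering by primer pair stands in for selective amplification.

```python
BASES = "ACGT"  # assumed mapping, not from the disclosure: 00->A, 01->C, 10->G, 11->T


def encode(data: bytes) -> str:
    """Encode bytes as a base sequence, four bases per byte, high bits first."""
    return "".join(BASES[(b >> s) & 3] for b in data for s in (6, 4, 2, 0))


def decode(seq: str) -> bytes:
    """Invert encode(): turn a base sequence back into bytes."""
    if len(seq) % 4:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


class Store:
    """Toy model of the pool: one record per plasmid (hypothetical structure)."""

    def __init__(self, genes, block_size):
        self.free_genes = list(genes)  # marker genes not yet assigned to a key
        self.gene_of = {}              # key -> gene; no two keys share a gene
        self.block_size = block_size   # bytes of value per plasmid
        self.pool = []                 # common pool of stored records

    def put(self, key: str, value: bytes) -> None:
        if key in self.gene_of:
            raise KeyError("key already stored")
        if not self.free_genes:
            raise RuntimeError("no unused marker gene left")
        gene = self.free_genes.pop(0)
        self.gene_of[key] = gene
        # Split a large value into blocks, each with its own primer pair.
        for start in range(0, len(value), self.block_size):
            idx = start // self.block_size
            self.pool.append({
                "gene": gene,
                "primers": (f"fwd-{idx}", f"rev-{idx}"),
                "key_seq": encode(key.encode()),
                "value_seq": encode(value[start:start + self.block_size]),
            })

    def get_block(self, key: str, idx: int) -> bytes:
        gene = self.gene_of[key]
        # Stand-in for cell sorting: keep records carrying the key's gene.
        sorted_cells = [c for c in self.pool if c["gene"] == gene]
        # Stand-in for selective amplification: keep the block's primer pair.
        primers = (f"fwd-{idx}", f"rev-{idx}")
        amplified = [c for c in sorted_cells if c["primers"] == primers]
        if not amplified:
            raise IndexError("no such block for this key")
        return decode(amplified[0]["value_seq"])

    def get(self, key: str) -> bytes:
        gene = self.gene_of[key]
        n_blocks = sum(1 for c in self.pool if c["gene"] == gene)
        return b"".join(self.get_block(key, i) for i in range(n_blocks))
```

In this sketch, `get_block` returns a single block without touching the others, mirroring the second variation, while `get` reassembles the whole value by retrieving every block in order.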