RE: Want to make a filesystem. 12-01-2017, 07:14 AM
#6
(12-01-2017, 06:56 AM)phaz0n Wrote: https://en.wikipedia.org/wiki/InterPlane...ile_System
This seems to do a lot of what you are talking about.
Keep in mind that distributed systems are much more complicated than their centralized counterparts. I am sure you will have no problem implementing the filesystem, but there is the entire p2p networking stack to worry about as well.
If the purpose is simply to store your own encrypted files securely, there is not enough incentive for other nodes to bear the cost of storing your encrypted data, which is useless to them. At this point you might as well just devise a centralized FS built for data redundancy and use of encryption.
Where distributed file systems really shine is not as a vault for selfish encrypted data storage but rather a cryptographically authenticated public lookup table of content, wherein each file is usually addressed by its hash. This allows nodes that share a file(s) of interest to efficiently store parts of the content, and share said parts among themselves.
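To make "each file is addressed by its hash" concrete, here is a minimal sketch in Python. It assumes SHA-256 as the hash and a plain dict standing in for the network-wide lookup table; the names `address`, `publish`, `fetch` and `CHUNK_SIZE` are made up for illustration and don't come from any real project.

```python
import hashlib

CHUNK_SIZE = 4  # assumption: tiny on purpose so a short file splits into several parts


def address(data: bytes) -> str:
    """Content address: the SHA-256 hex digest of the bytes themselves."""
    return hashlib.sha256(data).hexdigest()


def publish(data: bytes, store: dict) -> list:
    """Split data into parts, store each part under its own hash, return the hash list."""
    manifest = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        key = address(chunk)
        store[key] = chunk  # 'store' stands in for the shared lookup table
        manifest.append(key)
    return manifest


def fetch(manifest: list, store: dict) -> bytes:
    """Reassemble a file, checking every part against the address it was requested by."""
    parts = []
    for key in manifest:
        chunk = store[key]
        if address(chunk) != key:
            raise ValueError("chunk does not match its address")
        parts.append(chunk)
    return b"".join(parts)
```

Because the key is derived from the content, anyone fetching a part can verify it without trusting the node that served it, which is what makes sharing parts between strangers workable.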
In other words, there's no point distributing data you want to keep to yourself, especially since there is nothing to gain from someone storing your encrypted data for you. If you want the redundancy/fault tolerance then just write a normal FS for that IMHO.
That's exactly the opposite of what I'm talking about. What I'm proposing is using distributed hash tables, not a blockchain. I'm aiming for decentralized storage (there's no central ledger). Distributed hash tables are actually very simple; it's the network resolution of them that gets difficult. Luckily TCP/IP and ARP exist, so I don't have to do any of that sort of stuff.
The point of making this decentralized is that the disk has a near-unlimited size, and it's very difficult for an attacker to piece together any part of it, let alone a single file from the user. As for files referenced by hashes, this is what I was talking about. It's a distributed hash table. You can take a look at this project on my github about how hash maps work; DHTs are just a redundant and decentralized superset of hash maps.
The redundancy isn't a requirement of having a filesystem, it's a requirement of this specific type of file system. If one node goes offline, you would lose the data that's stored on that node, but if you had redundant nodes, no data is lost and another node replicates the missing data for future redundancy. It's there as a safety measure. Think of it like a RAID array. Each drive doesn't contain all of the data, but rather a piece of it (stored on a stripe). If one drive fails (assuming there's no parity), you would've lost the data on those stripes, but you wouldn't have lost the whole file. This isn't the best example because RAID commonly makes use of parity (which would be a security vulnerability for this scenario), but the general concept is the same.
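The redundancy behaviour described above can be sketched roughly like this in Python. This is a toy and not the actual design: it assumes two copies per key (`REPLICAS`), picks owners with rendezvous hashing (my choice for the example, nothing in the project mandates it), and keeps every "node" as a dict in one process. `ToyDHT`, `owners` and `fail` are hypothetical names.

```python
import hashlib

REPLICAS = 2  # assumed redundancy factor: each key lives on this many nodes


def h(value: str) -> int:
    """Map a string to a large integer using SHA-256."""
    return int(hashlib.sha256(value.encode()).hexdigest(), 16)


class ToyDHT:
    def __init__(self, node_names):
        # Each node is just a local dict here; a real node would be a remote peer.
        self.nodes = {name: {} for name in node_names}

    def owners(self, key: str):
        """Rank live nodes by hash(node:key) and take the first REPLICAS."""
        ranked = sorted(self.nodes, key=lambda n: h(n + ":" + key))
        return ranked[:REPLICAS]

    def put(self, key, value):
        for name in self.owners(key):
            self.nodes[name][key] = value

    def get(self, key):
        for name in self.owners(key):
            if key in self.nodes[name]:
                return self.nodes[name][key]
        raise KeyError(key)

    def fail(self, name):
        """Take a node offline, then re-replicate every key that lost a copy."""
        lost = self.nodes.pop(name)
        for key in lost:
            survivors = [s for s in self.nodes.values() if key in s]
            if not survivors:
                continue  # no redundant copy existed, so this key is gone
            value = survivors[0][key]
            for owner in self.owners(key):
                self.nodes[owner].setdefault(key, value)
```

With four nodes and `REPLICAS = 2`, no single node holds every key, and calling `fail()` on one owner leaves the data readable while another node picks up the missing copy.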
I understand the points you're trying to make, but none of them are really valid for this case. File systems, peer networking, graph theory, and kernel development are all particularly advanced topics, and I applaud you for joining in this conversation. You should track this project as time goes on; you could really pick up a lot of useful knowledge.
(12-01-2017, 07:05 AM)Ender Wrote: (12-01-2017, 06:56 AM)phaz0n Wrote: https://en.wikipedia.org/wiki/InterPlane...ile_System
This seems to do a lot of what you are talking about.
Close, but not exactly.
Blockchain-based systems keep everything in one chain, which is distributed. The problem with this in currencies is that there is the possibility of a 51% attack.
Earlier, I was talking to @"phyrrus9" about this. He is planning on using a distributed hash table for this, which is somewhat similar to a blockchain, but is split up into multiple parts, with no single node having all of it at once. Then an algorithm (he pointed me to Dijkstra's algorithm when talking about cryptocurrencies, not sure if that's what's being used here though) is used to distribute the data in the fastest way possible. In cryptocurrencies, this adds an extra layer of protection, since it's extremely hard to figure out how to make a fake valid transaction that gets routed to your specific node, and takes more computational power than is needed in Bitcoin and other systems.
No, Dijkstra's wouldn't be used here. It's a very inefficient algorithm that's very simple. It makes a great teaching tool for basic graph theory but it's not particularly useful in practice. We'd be using a variant of least cost routing and a tuned hash algorithm to determine the fastest route to nodes as well as the optimal placement for redundant and primary nodes.
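As a very rough illustration of combining a hash with link cost for placement (the real variant isn't specified here, so treat everything below as an assumption): the hash shortlists candidate nodes for a key, and the measured cost decides which candidate is primary and which are redundant. `place`, `rank`, `link_cost`, `candidates` and `copies` are hypothetical names, and the costs are assumed to be something like latency in ms measured from the local node.

```python
import hashlib


def rank(node: str, key: str) -> int:
    """Hash-based score that decides which nodes are candidates for a key."""
    return int(hashlib.sha256(f"{node}:{key}".encode()).hexdigest(), 16)


def place(key, link_cost, candidates=3, copies=2):
    """Return (primary, redundant_nodes) for a key.

    link_cost maps node name -> measured cost from this node (assumed input).
    """
    # Step 1: the hash narrows all known nodes down to a shortlist for this key.
    shortlist = sorted(link_cost, key=lambda n: rank(n, key))[:candidates]
    # Step 2: within the shortlist, the cheapest link wins.
    by_cost = sorted(shortlist, key=lambda n: link_cost[n])
    chosen = by_cost[:copies]
    return chosen[0], chosen[1:]
```

For example, `place("file", {"a": 40.0, "b": 5.0, "c": 12.0, "d": 90.0}, candidates=4)` shortlists all four nodes and picks `b` as primary with `c` as the redundant copy, since those are the two cheapest links.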