SLIDE 1
Repositories and content addressable storage
A data repository needs to (among other things)
- Make sure data remains safe and uncorrupted
- Make sure data remains available
- Keep previous versions when data is changed
Solutions are available, but...
- Links to data break -- how to make sure that once a link is created it never breaks?
○ Who keeps track of what is where?
- What if two files have different names but the same content (duplication)?
- How to deal with unexpected events?
Many solutions used centralized systems
- Single point of failure, single entity in control
- What about doing all of the above at scale (big data, etc.)?
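Sketch: content addressing in a few lines. This is a toy illustration of the idea in the slide title, not any particular system's implementation. The class `ContentStore`, its method names, and the choice of SHA-256 are assumptions made for the example. Data is stored under the hash of its content, so a link (the hash) always points to the same bytes, duplicates collapse into one copy, and old versions stay retrievable.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressable store (hypothetical, for illustration only)."""

    def __init__(self):
        self._blobs = {}  # content hash -> bytes
        self._names = {}  # human-readable name -> content hash of latest version

    def put(self, name: str, data: bytes) -> str:
        # The address is derived from the content itself, not from the name.
        digest = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same key, so it is stored only once.
        self._blobs.setdefault(digest, data)
        # The name now points at the new version; older blobs are not deleted.
        self._names[name] = digest
        return digest

    def get(self, digest: str) -> bytes:
        data = self._blobs[digest]
        # Re-hashing on read detects corruption of the stored bytes.
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("stored data does not match its address")
        return data

    def blob_count(self) -> int:
        return len(self._blobs)


store = ContentStore()
a = store.put("report.txt", b"same bytes")
b = store.put("copy-of-report.txt", b"same bytes")  # different name, same content
c = store.put("report.txt", b"changed bytes")       # new version under the same name
```

Here `a` and `b` are the same address, only two blobs are stored for three `put` calls, and `store.get(a)` still returns the first version after `report.txt` was changed. Who keeps the name-to-hash mapping is a separate question that this sketch keeps in a single dictionary, i.e. still centralized.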