What is the correct tool to properly merge a large set of tar.gz files that may have an enormous overlap of similar files, plus some that have been altered just slightly? Git plus some parsing seems close in that space: analyzing the files to create a dendrogram-like tree of potential alterations over time, by Levenshtein distance, may be useful to approximate a commit history. However, this doesn't seem to exist or be popular as a tool.

There's vimdiff or meld, but they are extremely manual and tedious, to the point of being pointless for something like a large history of Takeout tar.gz's. Throwing in the towel completely, borgfs can help reduce the amount of space they take through de-duplication on the block level, but this is a terrible solution because it doesn't really track file changes in a reasonable way. It is useful to extract the files into a directory without the tar or gz, but this can also cause issues with how to appropriately organize the directory structure over the history.

Any thoughts or projects that do a better job of this?

Saving a local copy of my Google Photos has been a passion project of mine since ~2014 (before Google Photos even!). For years it was only focused on downloading the data using APIs, but then we found out that Google strips location data (from your own photos!) if using the API, so I added Takeout support. Well, in 2019 I finally started working on a viewer. It has evolved a lot since it's a very ambitious project and there's nothing quite like it.

It's not just Google Photos: it's any photos and videos. It's also for your text messages and emails. It also supports Facebook, Twitter, and Instagram account exports. Timelinize is entity-aware, and it can map identities across data sources (with enough info, or with a manual mapping, or some optional heuristics). It's basically a really detailed view of your life and online history.

It's neat because I have my family pictures, my text messages between me and my wife when we were dating (and after of course), and there are different views to explore: map, timeline, conversations, gallery, and more to come (calendar, etc). We can even place non-geolocated data on a map since we can correlate timestamp and entity. So when we went on our honeymoon, I can see text messages received from friends while we were driving to a beach. It's really quite immersive and magical and I haven't seen anything quite like it.

And everything is stored on your own computer. It's a GUI app, and you have to have enough space to store your stuff. The data is just organized as files within a folder on disk, with a SQLite DB holding the index and the small textual items.

It's called Timelinize (might rename it?), and you can follow it here: (Click "Media" on the Twitter account to view a few screenshots for a preview. More to come!) (There's no website or project page yet because I've been busy developing.) If you want an invite to try out an early dev preview today, follow on Twitter and tweet at it, I'll see about getting you into the Discord.

The problem isn't one takeout overlapping (multiple zips from one date), it's many takeouts over the years (full history). So for example, in 2001 you make a takeout with 30 zips, and then delete half of your photos off of Google. Then in 2007 you have another 20 zips, and delete 25% of your emails and photos to make more room, 2008 again, on up to now. So now you have a big folder with many zips, and maybe some extracted folders, because things happen over the years. What's the best tool to merge all of this into one directory?

Photos can overlap a bit, so really a set union is all that's required for many files, but some will be slightly different, like the Google Keep notes. Git can help with the Google Keep notes that may have had things appended or removed. My best thought is to make some git repo and add things in, but to compute a Levenshtein distance on the bits of each file to check whether there is overlap in content, and to estimate the 'lineage' of a file if there is significant overlap with another. Effectively you reconstruct the git commit tree from the set of all files over all histories. Then you build the git repo history from all of the files. This would likely just be a local git repo, since it would likely be several terabytes of info, but that would be the general idea I guess.
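To make the merging idea from the question a bit more concrete, here is a rough Python sketch of its two parts: a set union of files by content, and a Levenshtein-based guess at whether two files share a 'lineage'. It assumes the takeouts have already been extracted into separate folders. The function names (`file_digest`, `union_by_content`, `levenshtein`, `likely_related`) and the 0.5 threshold are made up for illustration and are not taken from any existing tool. The edit distance here is quadratic in file size, so it is only plausible for small text files such as Keep notes, not for terabytes of photos.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def union_by_content(roots):
    """Map each distinct content digest to the first path that carries it.

    Byte-identical files from different takeouts collapse to one entry,
    which is the "set union" case (e.g. unchanged photos).
    """
    unique = {}
    for root in roots:
        for path in sorted(Path(root).rglob("*")):
            if path.is_file():
                unique.setdefault(file_digest(path), path)
    return unique


def levenshtein(a: bytes, b: bytes) -> int:
    """Classic dynamic-programming edit distance, O(len(a) * len(b))."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # deletion
                current[j - 1] + 1,          # insertion
                previous[j - 1] + (x != y),  # substitution
            ))
        previous = current
    return previous[-1]


def likely_related(a: bytes, b: bytes, threshold: float = 0.5) -> bool:
    """Guess whether two blobs share a lineage.

    The normalized-distance threshold is an arbitrary assumption.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return True
    return levenshtein(a, b) / longest <= threshold
```

Something like `union_by_content(["takeout-2001", "takeout-2007"])` (hypothetical folder names) would give the deduplicated file set. The files left over with distinct digests could then be compared pairwise with `likely_related` and, where they match, committed to a local git repo in date order to approximate a history.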