Author Topic: meta  (Read 103448 times)

Dave the Necrobumper

  • Objectively Awesome
  • ******
  • Posts: 12452
  • If I keep digging maybe I will get out of this hol
Re: meta
« Reply #770 on: April 22, 2021, 08:26:02 AM »
Thanks for that, really interesting to see the "fun" you had to find a solution to the problem without direct back-end access. You went down quite a rabbit hole. Do you have any other similar projects in the pipeline?

I think 1SO has a full copy of the site from a few months ago (it was over 5gb if I remember).

My current (very slow-progressing) project is to put together an app to record my movies watched and the reviews, then automatically post the watch on iCheckMovies and Letterboxd, and also the reviews on the forum. Still in very early days getting the basics of the data display. Part of the problem is I have not planned the design out, so I am making things and dumping them. I am doing it in VB, but I might use that to create some objects that can be used via a web front end, instead of doing the front end in Windows Forms. It is basically a bunch of ideas waiting for some order.

smirnoff

  • Objectively Awesome
  • ******
  • Posts: 26151
    • smirnoff's Top 100
Re: meta
« Reply #771 on: April 22, 2021, 10:29:30 AM »
I did help my girlfriend recently get her digital photo archive organized. She's not a fan of having stuff on the cloud (seems common in Germany), so she's just got everything on a 1TB portable drive. And that's it. No redundant backups. I was too scared for her data, particularly her photos. I wanted to get it backed up at least one more time somewhere else. The trouble was that she had like 600gb worth of photos and we didn't have another drive large enough kicking around. I forget the original number but I think it was over 100,000 files.

My goal was to stick the data on a few (like 2 or 3) writable DVDs that were kicking around, but that would take a lot of paring.

This drive was just a dumping ground for years of photos from different digital cameras, phone photos, and photos shared with her over WhatsApp or SMS. Everything just got kind of stuffed willy-nilly into one place, and it brought with it whatever file structure and file type it had originally. So there was every image format under the sun. Panasonic's proprietary .rw2 raw image format, .HEIC files from her iPhone, TIFFs, JPEGs, PNGs... you name it. Also there were video files mixed in amongst the photos. Not only that, but things often existed in duplicate, and with different file types and/or file names. All scattered through various folders.

I used SpaceSniffer to try and get a sense of what was taking up most of the space. Such a great tool for quickly visualizing data and targeting the biggest hogs. It's quick and easy to filter the data by file type, or file name, or any other sort of thing you could think to filter by. The videos were the biggest culprit of course. I just decided to not bother with them. That left all the raw image files. 15 to 20 megs apiece. Worst of all was the duplicates though. I figured before compressing any images I'd try and sort out the duplicates and keep the highest-quality versions.
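
For anyone who would rather script that first "what's eating the space" pass, here is a minimal sketch of the same idea in Python. It is not how SpaceSniffer works internally; it just walks a folder and totals bytes per file extension. The function names `size_by_extension` and `biggest_hogs` are made up for illustration.

```python
import os
from collections import defaultdict


def size_by_extension(root):
    """Walk `root` and total up file sizes (in bytes) per lowercase extension."""
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Files without an extension are grouped under "(none)".
            ext = os.path.splitext(name)[1].lower() or "(none)"
            try:
                totals[ext] += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # skip files that vanish or cannot be read
    return dict(totals)


def biggest_hogs(totals, top=10):
    """Return the `top` (extension, bytes) pairs, largest first."""
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top]
```

Calling `biggest_hogs(size_by_extension(some_folder))` would give a quick ranking of which file types (videos, raw files, and so on) take up the most room.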

That took me to VisiPics. Omg, what a genius piece of software. It goes so far beyond simply checking for duplicate filenames. It looks for actual image matches, regardless of format, filename or resolution. You even have a slider you can use to adjust how aggressive it is in finding matches. At the default settings it did a great job, really only identifying true image matches. In only a few cases did it match images that weren't technically the same (usually an identical landscape shot with a person way in the background in two slightly different positions, the photo taken twice in rapid succession). In these cases I didn't find that keeping one version over the other was consequential, so I didn't worry much about false positives with those settings. And once it's done its scan (in my case several hours later), you basically just press one button to automatically select all the duplicates. You have a chance to flip through and review them, to see that the software did its job okay, and then press delete and they all go away. The software seemed to be smart enough to always default to leaving the highest-quality image.
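
To give a feel for how content-based matching can work at all, here is a minimal sketch of one common technique, an "average hash". This is an assumption for illustration only and is not a description of what VisiPics actually does. It also assumes each image has already been decoded and shrunk to a small grayscale grid (say 8x8) by some other tool; `max_distance` plays a role loosely similar to the aggressiveness slider.

```python
def average_hash(pixels):
    """Hash a small grayscale grid (2D list of 0-255 values).

    Each bit records whether a pixel is brighter than the grid's mean,
    so the hash survives changes in format, size, or overall exposure.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming(a, b):
    """Count the positions where two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))


def looks_like_duplicate(pixels_a, pixels_b, max_distance=5):
    """Treat two grids as the same picture if their hashes are close.

    A higher max_distance matches more aggressively (more false positives).
    """
    return hamming(average_hash(pixels_a), average_hash(pixels_b)) <= max_distance
```

Two near-identical shots produce hashes that differ in only a few bits, which is why a looser threshold starts pairing up photos taken in rapid succession.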

The only catch I had was that VisiPics couldn't handle some of the raw image formats. If I recall correctly I used FastStone Image Viewer to handle all of the .RW2 conversion, and iMazing HEIC Converter for the HEIC files. Then after getting the raw stuff converted to PNG I ran VisiPics again to remove any new duplicates that had been introduced by making the raw files an accessible format. After finally being satisfied there were no more duplicates I think I was left with 45,000-ish files.

From there I just wanted to get the file size down, especially the PNGs. I originally had planned to use Photoshop to batch convert everything to JPEG, and drop the resolution, but I found FastStone's interface was just way easier to work with and the JPEGs not noticeably worse for their size than what Photoshop was producing. Also, it was way easier to make slight adjustments in FastStone, and preview the results. Setting up the bulk conversion and size reduction was really easy, and it had all the options you'd want for such a process.

In the end I got it down to 45,000 images in 8 gigs. Still way too many photos to be easily browsed, and nobody has spent the time to delete the obviously flawed ones (blurry). I reckon you'd knock out at least another couple thousand pictures. Then there's all the photos that are not duplicates per se, but 10 versions of nearly the same thing. These take time to go through, since there is usually one which does stand out as a clear "best". But that's a project for another day (and not mine, hehe). Also, organizing them into a file structure that is chronological, or event-based, or something would be a good thing.

I just think if you're going to keep photos, then just keep what's worth keeping. If it's just an ugly data dump, you're probably never going to look at them again because it will be too frustrating to flip past so much sameness. Every photo should spark joy (Marie Kondo style) or why keep it.



I dunno, I have strong feelings about these things. I digitized my parents' photo albums a few years back as a sort of Christmas present. It really made me appreciate the type of photo that's worthwhile. Most of the photos were between 10 and 40 years old, so they've had a lot of time to settle (as in historically, in your mind). The number one type of photo I'd say that I eliminated (i.e. didn't bother scanning) were postcard-like shots without anybody in the photo. A vista, a building, a landmark, etc. Usually from some trip. I'd keep ONE (the best one) to establish where we were, but the others were just another amateur photo of x_famous_place, or not so famous place. But if it had people in it, someone from the family or a friend, etc, it made all the difference. "eh, look at your clothes and your hair!" "Oh it's so and so, I haven't seen them in years..."

It was the mundane pictures that were the real treasure in the collection. Grandma's birthday photo where you could see half of "that old green fake leather chair" in the background... or a picture of your mudroom with your coats and shoes. That stuff brings a lot back to you. It's powerful. But who takes a picture of their mudroom? Why would you? But those are the photos that everybody spends the most time poring over. Looking at the little details of our daily life and remembering how it was.

"Say cheese" type pictures were hit and miss. I'd usually keep one or two of the best ones from each occasion, or more if further pictures featured rare faces. Pictures taken of people who didn't know they were having their picture taken were almost always better.

Probably the rarest picture type of all (keeping in mind I'm speaking exclusively about film era pics, when people didn't experiment or practice taking pictures) were the "just good photography" type. By accident or on purpose, conditions were just right and someone captured a thing in such a way that it showed craft... or art.
« Last Edit: April 22, 2021, 11:08:50 AM by smirnoff »

1SO

  • FAB
  • Objectively Awesome
  • ******
  • Posts: 35694
  • Marathon Man
Re: meta
« Reply #772 on: April 22, 2021, 03:12:07 PM »

smirnoff

  • Objectively Awesome
  • ******
  • Posts: 26151
    • smirnoff's Top 100
Re: meta
« Reply #773 on: April 22, 2021, 03:18:35 PM »
:))

smirnoff

  • Objectively Awesome
  • ******
  • Posts: 26151
    • smirnoff's Top 100
Re: meta
« Reply #774 on: April 23, 2021, 02:58:08 AM »
My current (very slow-progressing) project is to put together an app to record my movies watched and the reviews, then automatically post the watch on iCheckMovies and Letterboxd, and also the reviews on the forum. Still in very early days getting the basics of the data display. Part of the problem is I have not planned the design out, so I am making things and dumping them. I am doing it in VB, but I might use that to create some objects that can be used via a web front end, instead of doing the front end in Windows Forms. It is basically a bunch of ideas waiting for some order.

It seems a worthy project. I like to have a record of my viewings and my thoughts, and tools like Letterboxd and iCheckMovies are great, but taking time to maintain them all is tiresome.

At times I've wondered if it would be possible to integrate an invisible tagging system into the boards. Like if I had just watched Bad Boys (1983) and written a review of it, I wish I could tag the title so it referred specifically to the 1983 Bad Boys, and that tag would be searchable. Then someone specifically wanting to search for mentions of 1983's Bad Boys could search for it by its unique tag, and wouldn't get a bunch of false positives from the Will Smith Bad Boys franchise. I think the best way would be to use IMDb's unique identification numbers from their URLs. So like:



[imdb=tt0085210][b]Bad Boys[/b][/imdb]

Blah blah blah. Sean Penn. Blah blah blah.



Then you could query "tt0085210" and get just what you're looking for. It probably doesn't make sense to start integrating something like that at this stage though. I can't think of a realistic way to retroactively apply it to existing content.
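
As a rough illustration of the search side, here is a minimal sketch in Python, assuming the hypothetical `[imdb=tt...]...[/imdb]` tag from the example and assuming posts are available as plain text keyed by post id. None of this reflects how the forum software actually stores or searches posts; `index_posts` is a made-up name.

```python
import re
from collections import defaultdict

# Matches the hypothetical tag: [imdb=tt0085210]...anything...[/imdb]
IMDB_TAG = re.compile(r"\[imdb=(tt\d+)\](.*?)\[/imdb\]", re.IGNORECASE | re.DOTALL)


def index_posts(posts):
    """Map each IMDb id to the sorted list of post ids that tag it.

    `posts` is a dict of post id -> raw post text.
    """
    index = defaultdict(set)
    for post_id, text in posts.items():
        for match in IMDB_TAG.finditer(text):
            index[match.group(1).lower()].add(post_id)
    return {imdb_id: sorted(ids) for imdb_id, ids in index.items()}
```

Looking up `index_posts(posts).get("tt0085210", [])` would then return only the posts that tagged that specific film, and an untagged mention of a same-named title would not show up.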

It's the strength of Letterboxd I guess that reviews are all attached to their films directly, but the consequence is that the community dynamic breaks down because people's attention and discussion are scattered. I like the "respond to" thread because it all happens in one place. It's just that it can be hard to find mentions of a specific film within it.

 
