Opinion
Algorithm bias and image search inside DAM can be problematic. Several companies are developing algorithms that, they claim, can accomplish image search with no metadata at all.
The concept of no metadata in the DAM environment relies on algorithms built on something called “attention”: instead of scraping metadata from images to identify them and put them in context, the text surrounding images is used to infer context for the image. This new way to contextualize images is being put forward as relying “solely on visual elements”.
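To make that concrete, here is a minimal sketch of the surrounding-text idea, assuming the images live on an ordinary web page. The function name, the sibling “window”, and the use of requests and BeautifulSoup are my own illustrative choices, not any vendor’s actual implementation:

```python
# Minimal sketch: infer an image's "context" from the text near it on a page.
# Hypothetical function and window size; not any vendor's real pipeline.
import requests
from bs4 import BeautifulSoup

def infer_image_context(page_url: str, window: int = 2) -> dict:
    """Map each image URL on a page to the text of its neighboring elements."""
    html = requests.get(page_url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    contexts = {}
    for img in soup.find_all("img"):
        # Collect text from a few sibling elements before and after the <img>;
        # the size of this "window" is an arbitrary illustrative choice.
        siblings = list(img.previous_siblings)[:window] + list(img.next_siblings)[:window]
        nearby = []
        for sibling in siblings:
            text = sibling.get_text() if hasattr(sibling, "get_text") else str(sibling)
            if text.strip():
                nearby.append(text.strip())
        contexts[img.get("src", "")] = " ".join(nearby)
    return contexts
```

Calling infer_image_context("https://example.com/article") on a hypothetical page would return a mapping from each image to whatever prose happens to sit beside it, and everything downstream hinges on whoever wrote that nearby text.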
Is it though?
In today’s post I’ll talk about image search without metadata and algorithm bias.
Why get so up in your DAM business with this granular an opinion?
Because companies operating in the real world need real-world solutions, and that means paying widely diverse attention to how algorithms treat human-informed images and images categorized by humans, which is what this technology is essentially built on.
Companies use DAM systems and rely on those systems’ algorithms, for better or worse, to deliver assets and groups of assets reliably and inclusively. It’s important. It’s not something “we’ll get to” anymore; it has to be a core value of technology companies. And algorithms have notoriously been lacking in their world views.
The idea that we can type in a search like “Woman CEO teaching course at university” or “Black Women Software Coders Talking” and get back photos precisely related to that search phrase is invigorating. I mean, have you ever had that happen, where you got exactly what you were looking for?
Algorithms fall far short when searching - in any environment - if you’re curating assets that relate to audiences wider than white people. Through the turbulence of 2020, the use of CCTV to identify people during unrest and protests increased; those results were often wrong, the flaws in facial recognition were uncovered, and some of the outcomes were tragic.
And as we’ve seen recently, TikTok’s algorithm is clearly biased.
Instagram is increasingly culpable of banning images its own policies wouldn’t find objectionable while allowing images that violate them, often because of the text surrounding them.
When companies respond to criticism of flaws in their image recognition algorithms by saying their tools aren’t being used correctly, it simply confirms how important it is to have a broader perspective on your product, rather than blaming user error, if you’re going to claim your algorithm is for the masses.
That’s because AI is still predominantly built without a global context - and yes, by that I mean AI continues to be built by males, white males specifically, who have white male bias (conscious or unconscious), and that bias gets baked into AI’s basic DNA.
This means the technology your DAM salesperson tells you is cutting edge - TikTok, Instagram, Facebook, CCTV, and so many more - has bias baked in but is applied without conscience. By that I mean these systems are open to wildly inaccurate results, with real impacts on users and outcomes.
Ok, so what does that have to do with “All I’m doing is searching my DAM for some darned photos?”
Well, based on how software touted as “no metadata needed” is being taught, these algorithms scrape the text surrounding images to get the image context, then group images together to serve up results.
They’re not looking at metadata but at descriptive, human-written text that is supposed to somehow describe the images more correctly than, or in place of, purposefully written metadata. This opens up another way for AI algorithms to get things wrong, because the contextual information around images is just as subject to “bad” information as “bad” metadata is.
Who is describing the images being scraped? A simple question that technology tends to overlook.
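A toy example shows why that question matters. This is a minimal sketch, assuming the surrounding text has already been scraped; the image names, captions, and simple TF-IDF ranking are hypothetical stand-ins for whatever a vendor actually uses, but the failure mode is the same - whoever wrote the caption decides the ranking, and the pixels are never consulted:

```python
# Minimal sketch: rank images for a search phrase purely by the similarity
# of their scraped surrounding text. Captions are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

surrounding_text = {
    "photo_1.jpg": "Our CEO teaching a course at the university today",
    "photo_2.jpg": "Black women software coders talking at the meetup",
    "photo_3.jpg": "Office party photos from the summer picnic",
}

images = list(surrounding_text)
vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(surrounding_text.values())

# The image content never enters this computation - only the captions do.
query_vector = vectorizer.transform(["woman CEO teaching course at university"])
scores = cosine_similarity(query_vector, doc_vectors)[0]

for image, score in sorted(zip(images, scores), key=lambda pair: -pair[1]):
    print(f"{image}: {score:.2f}")
```

Change one caption and the ranking changes with it: the algorithm inherits whatever bias, error, or omission the caption’s author brought, exactly as it would with “bad” metadata.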
So, while this may be a controversial statement: if your company requires inclusivity in its marketing, branding, client-facing products, and work, and your DAM vendor is telling you they “do that” or that their technology is built completely in-house, be sure to look at their executive team. It will help you identify how well their algorithm may or may not fit your needs.