When automatic metadata generation goes bad…
Lorna Campbell, Cetis Blog, Tue, 24 Nov 2009
http://blogs.cetis.org.uk/lmc/2009/11/24/when-automatic-metadata-generation-goes-bad/

Or the strange case of Drs E. Embuggerance and H. Feisty.

This has already been reported on several other blogs but it’s too good not to share again. Looks like Google Scholar needs to work on its automatic metadata generation algorithm:

Embuggerance, E., and H. Feisty. 2008. The linguistics of laughter. English Today 1, no. 04: 47-47.

This curious incident was originally reported by Stephen Chrisomalis and subsequently picked up by Language Log. The comments on the latter post are particularly entertaining. Mark Liberman helpfully provides Google Scholar’s BibTeX citation:

@article{embuggerance2008linguistics,
title={{The linguistics of laughter}},
author={Embuggerance, E. and Feisty, H.},
journal={English Today},
volume={1},
number={04},
pages={47--47},
year={2008},
publisher={Cambridge Univ Press}
}

He goes on to suggest:

“Perhaps we should continue the tradition of metonymic names for new linguistic natural kinds, and use embuggerance for cases where the automatic tagging of entities and relations goes astray.”

Over on Stephen Chrisomalis’s blog, the comments have taken a different turn and degenerated into a rather thoughtful discussion of the relative merits of Google Scholar and JSTOR, and of automatic metadata generation vs. human indexing. One commenter, Laughingrat, is appalled that any academic would even consider using Google Scholar:

“…unless your college or university is extremely underfunded, the school library should have access to high-quality databases which contain records indexed by information professionals rather than unqualified hirelings or, worse, computers.”

Another commenter, Dale, puts forward a robust argument in favour of Google Scholar in particular and automatic metadata generation in general:

“Many in libraries and academia are keen to point out all of the warts in Google Scholar, but are less keen to be so critical of the databases for which they pay. That the MLA Bibliography, for example, is years behind in indexing scores of journals, and has incredibly poor coverage in many non-English languages (despite the International boast in its name) is a little known or explored fact in libraries. Other fee-based databases evince similar flaws (Library Literature, ironically, is one of the worst), but it isn’t nearly as much fun to pick on them as it is to shellac Google.”

The last word also has to go to Dale:

“What it comes down is machine indexing vs. human indexing. I cannot get the image of John Henry out of my mind when I think about this matchup. I think the human indexers can only win by extreme effort, and we all know what happened to poor John.”

Amen to that!
