Friday night live with EDITH 0.1

The following was Show and Tell 1, presented to the staff at education.au.

Edna Proof of Concept 2 Show and Tell 1

Pop Quiz! ..and you thought you didn’t have to do any work!

Q: How many current records are held in the edna collection (that is not including the distributed collections like ABC Online, but core edna)?
A: 36704

Q: What did the Dam Vam report estimate as the annual $ value to all educators of their use of the edna collection search?
A:
A) $1 Million
B) $2 Million
C) $4 Million <-- correct

A huge amount of work has been invested over the years just in creating the records, let alone in building and maintaining the edna collection as a whole.

When asked to innovate collection management we first had to understand 'what's hard about collection management'?

The answer lies in building the collection. Scouring the internet, evaluating each resource and describing the results. It's a long, knowledge intensive process.

So what's this Proof of Concept vision?

“Collection Improvement through user engagement and better metadata tools”

Breaking this down we are seeking to

  • Improve the quantity and quality of the edna collection
  • Maintain the integrity of the collection
  • Automate the drudge work
  • Engage with educators where they are online such as social bookmarking sites.

What I’d like to show now is how the way an Information Officer maintains the edna collection will change for the better through the Collection Management POC.

First let’s look at the way our IO, Jade, currently builds resources.

IOs Job: adding a resource

Jade’s job is to discover, evaluate and describe online resources for inclusion into the edna collection. Sounds simple doesn’t it?

DISCOVERY

Let’s have a look at Discovery.

Like other IOs, Jade has her own list of preferred resource sources as well as aggregated resources.

These include
1) Education Websites
2) Online Journals
3) Online Media outlets
4) RSS news aggregators
5) Events networks
6) Email listservs
7) Groups
8) Conference alerts
9) Google alerts
10) Government websites and news services

There are many, many more and, as this slide indicates, Jade has to:
1) Remember to check each source
2) Take the time to visit them all.

As a result she has:

3) Reduced time to add value to the collection
4) A lack of certainty about the resources’ relevance to educators

  • Wouldn’t it be better if there were fewer places for Jade to check?
  • Wouldn’t it be better if Jade knew what educators really want to keep?
  • How can we save Jade the IO’s time for more valuable activities?

You may have guessed why we have called her Jade – it’s not because she is jaded with her job – she loves it! But she is jaded with the repetitive nature of some of the work and frustrated by not getting to the really valuable part of her work because of the endless scouring of the same places on the internet and the adding of the same metadata over and over again. These are clues to possible points of automation and user contribution.

EVALUATION

Jade now has her list of URLs for potential inclusion in the collection, and next she needs to evaluate them. Evaluation, just like programming, is knowledge intensive.
Jade uses the edna collection policy to discriminate between the useful and the non-useful, the promotional and the edifying.

There are three collection management policies:

  • edna Governance Policy
  • edna Collection Policy
  • edna Collection Policy Schedules (sector specific)

These cover issues of

  • Accessibility
  • Authority – from reputable sites
  • Reliability – is the website going to be there tomorrow?
  • Uniqueness – primary source and value
  • Objectivity – impartiality and balance of views
  • Ethics and legality – respectful of law and decency

The second phase of our project will be to look at employing our user engagement and metadata tools to improve the evaluation stage. Show and Tell 2 will cover this in detail; it shows evaluation as a knowledge intensive process to maintain the integrity of the edna collection.

DESCRIBE

With the evaluated resources, Jade can now describe the online resources.

DSPACE is the workshop of the edna collection.

Jade passes through DSPACE’s 6 phases, describing each resource – a lengthy process incorporating further checks, many metadata schemas, and interpretation of edna policy.

With hard work, persistence and the application of her knowledge and expertise, Jade will have grown the edna collection by at least 1 record. This is how Jade and the other IOs have grown the edna collection to more than 36,000 records.

INNOVATION

So how are we innovating what’s hard about collection management? By engaging with educators where they ‘nest’ online, e.g. social bookmarking sites like Del.icio.us, and by automating the drudge work.

EDITH

We do currently provide a way for users to suggest sites to edna. We ask them to submit a suggestion and we receive an email. This is a very web 1.0 approach to things and it also results in very few actual edna entries. With our proposed model we will go out and find what the users think is useful.

How will we do this? With EDITH. We are innovating Jade’s job with the EDITH engine: Edna Discovery Information Trapper and Hoarder engine. We’ll automate as much as possible and provide the IO with as much information as possible to assist their decision making process.

The EDITH engine is being designed to tap into social networking information repositories like Del.icio.us, Scuttle, Digg and Wikipedia. We are starting with del.icio.us as a proof of concept, but the model could be adapted to include other sources.

Quick Quiz

Q: Approximately how many internet users are there in the world?
A:
A) 700 million
B) 1.1 billion <-- correct
C) 1.9 billion

Q: What's NetCraft's best guess for the number of active websites in the world?
A:
A) 50 million
B) 100 million
C) 122 million <-- correct

And worth mentioning that’s only sites – Google searches over 8 billion web pages – and that’s not the whole web.

Q: Approximately how many del.icio.us users are there in the world?
A:
A) 220 thousand
B) 370 thousand
C) 2 million <-- correct

Q: How many Information Officers are there in education.au?
A: 11

1.1 billion people using 122 million websites makes discovering educationally valuable online resources daunting. It’s impossible for 11 ed.au IOs to cover everything.

The POC team quickly realised that in fact there are ‘Accidental Information Officers’ out there in the Social bookmarking sites. How do we find them?

With 2 million social bookmarkers describing the web, the challenge for us was to separate the wisdom of the crowd from the madness of the masses.

The team set out to find the:

  • Authoritative Users and
  • Authoritative Tags

The solution was to find del.icio.us users whose tags match the edna collection's category keywords. For example we discovered a user called ‘Mark Booker’. He loves to bookmark!

Another quantitative check employed is the number of times a resource has been bookmarked.

We further checked Mark Booker’s educational bookmarking credentials by checking whether some of his bookmarks already exist in the edna collection. Having passed this test, Mark Booker has become an Accidental Information Officer.
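The two user-level checks described above (tags that match edna category keywords, and bookmarks that already exist in the edna collection) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the EDITH implementation: the function name `is_accidental_io`, the `(url, tags)` tuple layout and both thresholds are hypothetical, and the bookmark data is assumed to have been fetched already.

```python
def is_accidental_io(bookmarks, edna_keywords, edna_urls,
                     min_tag_matches=3, min_known_urls=1):
    """Return True if a user's bookmarks pass both checks.

    bookmarks     -- list of (url, tags) tuples for one user (assumed layout)
    edna_keywords -- set of edna category keywords
    edna_urls     -- set of URLs already held in the edna collection

    Both thresholds are hypothetical values chosen for illustration.
    """
    keywords = {k.lower() for k in edna_keywords}

    # Check 1: how many bookmarks carry at least one tag that matches
    # an edna category keyword (case-insensitive)?
    tag_matches = sum(
        1 for _url, tags in bookmarks
        if {t.lower() for t in tags} & keywords
    )

    # Check 2: how many bookmarked URLs are already in the edna collection?
    known = sum(1 for url, _tags in bookmarks if url in edna_urls)

    return tag_matches >= min_tag_matches and known >= min_known_urls
```

A user with three keyword-tagged bookmarks, one of which is already an edna record, would pass with these defaults; the same user would fail if none of the bookmarks were known to edna.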

In fact there are 8,400 other ‘Mark Bookers’ out there. Nick is going to show you how we capture what they are bookmarking. Compare this figure of 8,400 with the 140 ed.au DSPACE users who have helped create the collection over 10 years.

This group of Accidental Information Officers provides us with a list of candidate bookmarks for evaluating and describing. The resources coming to us are not fully evaluated, but they are healthy candidates, coming from promising users and promising tags.
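One way to picture how such a candidate list might be assembled is sketched below. It is a hedged illustration rather than the method EDITH uses: the function name `candidate_list`, the dictionary layout and the `min_count` threshold are assumptions, and the bookmark count is taken only across the authoritative users rather than across all of del.icio.us.

```python
from collections import Counter


def candidate_list(bookmarks_by_user, authoritative_users, edna_urls,
                   min_count=2):
    """Rank URLs bookmarked by authoritative users that edna does not hold.

    bookmarks_by_user   -- dict mapping user name to a list of URLs (assumed)
    authoritative_users -- users who passed the earlier checks
    edna_urls           -- set of URLs already in the edna collection
    min_count           -- hypothetical minimum number of bookmarking users
    """
    counts = Counter()
    for user in authoritative_users:
        # Count each URL once per user so that a single keen bookmarker
        # cannot inflate a resource's popularity.
        for url in set(bookmarks_by_user.get(user, [])):
            if url not in edna_urls:
                counts[url] += 1

    # Most-bookmarked first; ties broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(url, n) for url, n in ranked if n >= min_count]
```

The result is still only a shortlist for the IO to evaluate and describe; as noted elsewhere in this talk, quantity is not an indicator of quality.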

What have we actually achieved here? In fact we have answered our earlier questions:

Wouldn’t it be better if Jade didn’t have to check so many places?

Yes – EDITH will collate many sources into a single place.

Wouldn’t it be better if Jade knew what educators really want to keep?

Yes – EDITH discovers what the Mark Bookers believe is important to them.

How can we save Jade the IO’s time for more valuable activities?

  • Having greater coverage of the internet by more people
  • Reducing search time
  • Giving her more time to look in depth at complex resources

Is this all theory?

No. Results are real – new edna collection resources have already been added. Nick will tell you about the approach and results in detail.

What problems are there?

There are some questions to consider:

It is dependent on what people bookmark. PDFs are harder to bookmark for example.

Quantity is not an indicator of quality.

We need to think about potential copyright issues with mining people’s bookmarks.

The next Show and Tell will address issues of evaluation of quality; helping to create a shortlist from this mass of candidate bookmarks.

——-

Images of people for “Discovery: the internet” sourced from http://iconka.com and MS Office Clip art.
