Sunday, 24 February 2019

Data4good conference

I am interested in open data and privacy for/in libraries, but I am also aware that I am always learning - and those areas are no exceptions. So I was very excited when in 2018 I was invited to be part of the Data4good conference which took place at the Library of Birmingham on 14 November. Not only was I going to a conference all about data, I was going to be on a panel in the session about responsible data and ethics! 😁

And so, thanks to Pauline Roche and the lovely people at DataKind UK, I went. The Data4good conference was organised by non-profit organisations involved in data work and aimed at fellow charities. I have to admit I was a bit nervous as to what attendees would think of a library person speaking at a conference aimed at the non-profit sector. What helped me (on top of reassurance from the organisers!) was thinking that actually, staff in public libraries and charities have a similar goal: we all work to help people, to improve their lives, society, environment. (And I am sure some of you are also thinking that some UK public library services are operated as/by non-profit organisations.)

Photo of Data4good conference bag and programme


"Combining data geekery with a desire to do good"
The programme was packed. The opening speakers did a great job of setting the tone of the conference and providing examples of using data for good. I am afraid the talk I now remember most is Karl Wilding's (director of policy at NCVO). He stated a few things that were probably obvious for the audience but needed to be said - and while he was talking I kept thinking things like: "This totally applies to the library sector too!" and "I wish library colleagues could hear this!" You can access his notes from the conference programme web page but here are three ideas I picked out:
  • "Think about how you can use data to tell a story, to change people's lives" - it's not about the data, it's what you use it for.
  • Sometimes we as organisations move faster than our members [also read colleagues, stakeholders...] We have to be mindful that not everybody agrees with us, that not everybody thinks this stuff is powerful.
  • On values and data ethics:
"There are a number of standards like GDPR on how we use data; I think as civil society we should go further than that."

Responsible use of data
The session I took part in was entitled How to innovate with data whilst being respectful [click to download the presentation]. I provided a short introduction to online rights and data privacy - how it impacts us as individuals and why it concerns non-profits.

Tom Walker from The Engine Room then gave concrete examples from the charity sector and introduced the Responsible Data Forum. The site is a place for anyone in the sector to get information about using data responsibly. There is also a very active RD mailing list for members of the community to discuss particular points further.

Responsible data definition (slide)
Responsible data definition - slide from Tom Walker's presentation

If you read one page from the Responsible Data site, I would recommend RD 101: Responsible Data principles by Zara Rahman. Here's an extract that again struck me as particularly relevant for the libraries and information sector, and close to one of the messages I have been trying to promote through my privacy work.
"Holding ourselves to higher standards: In many cases, legal and regulatory frameworks have not yet caught up to the real-world effects of data and technology. How can we push ourselves to have higher standards and to lead by example? 
Working in social change and advocacy means we hold ourselves to a certain set of ideals. Profit isn’t our goal – positive social change is."

But back to the Data4good session: next Amy O'Donnell from Oxfam explained what they'd done to embed responsible data practices within the charity. Oxfam developed their responsible data policy in 2015 and went on to create a training pack for staff.

The training pack includes an "Agree or disagree?" deck of cards meant to get staff thinking - and talking - about responsible data management. Staff are encouraged to explain and discuss why they agree or disagree with the statement on their card. Some example statements:
  • Once we've collected information, it belongs to our organisation and we can do whatever we want with it.
  • Privacy is more important than transparency.
  • To make data anonymous, we should remove all names.

When I run privacy workshops, the moments I find most fascinating are when participants start asking these kinds of questions and everyone else pitches in with their opinion and knowledge. I think Oxfam's "Agree or disagree?" cards are a good way to bring people to those debates. I'm already thinking of how I could use some of the cards, with extra statements more specific to libraries, in future workshops.

Oxfam's "Agree or disagree?" card deck

Data literacy
After lunch there was a "curiosity and cake" session - a concept close to an unconference with cupcakes. Each table had a pre-defined topic and a facilitator. Participants were asked to collect a cupcake on the way in, then sit at a table that had a topic they wanted to discuss and/or learn more about. I passed many tables I would have been interested in (data ethics and governance, open data sets), but eventually I settled at the data literacy table.

The discussion at the table covered:
  • What is data literacy? We agreed that it is about being able to collect data, use it and analyse it, but also about being able to do all this in an ethical way.
  • Why should people be data literate? The answer to this relates to what your organisation is trying to achieve, or what your role is within the organisation. For example, in my library I want citizens to be data literate so they are in a better position to make informed decisions based on raw or visualised data. And I want colleagues to be data literate both so they can better uphold citizens' privacy; and so they make best use of the data we do collect to gain insights into the library service and in turn use these insights to inform their work.
  • How do you work towards uniform data literacy across the organisation? It is very hard to reach all individuals, at different levels of the organisation. We also talked about resources available to teach data literacy to staff in non-profit organisations, such as databasic.io and the IFRC Data Essentials playbook.
  • How can you engage non-data literate people with the importance of becoming data literate? How do we demystify data and data literacy? I mentioned the data days (e.g. Datamorphosis) we'd held at Newcastle City Library in 2016 and the next idea, which would be a kind of "data as art" event: helping people feel less intimidated by data sets by getting them to visualise data through activities they may be more familiar with, such as art and craft. Very helpfully, another participant pointed out that I should plan my event not just around data visualisation but also include a data analysis stage, in which we would ask: "What does this visualisation tell you? What do you think? What should we do about it?"

Icon of a human head with an open box inside it
mind by Adrien Coquet from the Noun Project

Further thoughts
I'm going to stop here with the account from the conference sessions to try and order some of my thoughts instead.

Data literacy is a topic I am getting back into, as I am co-organising Voyage of the Data Treader 2 (a data camp for library staff) next month and I have also committed to organising another data day for citizens at Newcastle Libraries. While writing this blog post I took the time to explore databasic.io and I spotted several activities I would like to adapt and re-use for my own work. The Build a data structure activity is like a short version of the "data as art" event I want to run at Newcastle Libraries, with useful pointers on what to highlight as a facilitator. And I will be proposing Deconstruct a dataviz (perhaps mixed with another activity) as an unconference session at Voyage of the Data Treader; it looks like a great way to talk about what we would want to use library data for and how to build stories with data.

I recently subscribed to the Responsible Data mailing list, out of curiosity.
Having somewhere professionals from the same sector could talk about matters relating to data protection and citizens' privacy is something I've been discussing with Ben White (British Library) and Fred Saunderson (National Library of Scotland) over the past couple of years. They came up with the idea as they were working to make their organisations GDPR-compliant and thought colleagues across the libraries, archives and museums sectors would be going through similar things and have similar questions.
We thought it would be handy to have one place that curates useful sources of information for the sector(s), but also where colleagues could ask questions and anyone with a bit of experience could contribute to the answers. As a result of our conversations we set up an informal group called CHIPA (for Cultural Heritage Institutions Privacy Alliance) with its own website and Twitter account - and yes, there is a mailing list ready. But it hasn't really taken off; one of the reasons for that is our own lack of time to get it moving further.
Now that I see how active the RD list is, there's the question: is something like it needed for our sector? Would a different way to share best practice be more useful? I'd be curious to read your thoughts!

Tuesday, 7 August 2018

Release your (library) data!

This article was written for the CILIP Public & Mobile Libraries Group journal and published in Access issue 18, Spring 2018.


Open data is information that is publicly available and has been placed under an open licence. "Open" means some (or all) of the copyright has been waived so that anyone is free to copy, use, modify and share the information, even for commercial purposes.

Newcastle Libraries have been publishing open data since March 2016; we started with information that we already collected and that was therefore easy to publish: number of loans, of visitors to libraries, titles in our catalogue, etc. Other UK public library organisations - for example Leeds Libraries, Plymouth Libraries and Libraries Unlimited, as well as LibrariesWest - have started releasing their own data, often publishing slightly different information.

Why open data and libraries
As library staff it is our role to facilitate access to and the sharing of information, knowledge and culture. We are used to giving citizens access to content created by others: access to books, music, films, images, online resources... but what about information we create ourselves? A lot of data about the library service is generated by the systems we use (our library management systems, computer booking systems, automatic people counters, etc.) or is inputted, e.g. addresses of libraries, opening hours, services offered at each site...

My opinion is that to fully fulfil our role as library staff we should also enable citizens to access the information we create about their library service. We are only the custodians of this information, and we should be giving it back to them in a way that will allow them to reappropriate it i.e. as open data.

What does open data mean for you?
Answers from participants at Next Library conference Open data mon amour workshop, June 2017.

Benefits of publishing open data
Sometimes I get the comment: "That all sounds well, but that's not going to convince my head of libraries. Why should we dedicate some time to publishing our information as open data over something else?" I'll be honest: in Newcastle Libraries we did it because we believe in it. And it worked. But here are some more arguments for you:
  • Commitment to transparency
Publishing information as open data shows a library authority’s commitment to transparency, by being up-front with the citizens it serves and providing them with authoritative data about its services.
It also means the library service cannot be accused of hiding anything since, on the contrary, it has made its information and statistics public and easy to find. In a way, it bypasses the type of journalism that looks for juicy stories, or at least lowers the risk of those stories having a big impact.
  • Savings on FOI
Actively publishing data under an open licence saves time on some Freedom of Information requests, as the information will already be available and you can simply refer to where it is stored.
  • Benchmarking at a lower cost
If more and more library authorities publish information about their collections, their services, usage, etc. as open data and it is done in a standardised way (using a schema), we will be able to compare a library authority with another without having to pay (CIPFA) to get that information.
  • Demonstrating social impact
Libraries Unlimited have been running an Arts Council England-supported project that looks into using library data to better demonstrate the social impact of public libraries. It is worth reading about the project's outputs so far (on the Unlimited Value blog).
  • Unexpected help and insights
You never know what citizens are going to do with your open data. To be frank: they may not do anything with it, or they may do something that you won't hear about. But publishing your data gives the opportunity to citizens – some of whom may be data scientists, developers or simply curious individuals – to look at it, analyse it alongside other types of data, build something with it. They may contact you with their comments and findings, giving you new insights into your data and your library service. They may use your data in their research, giving your service some visibility in places you wouldn't reach. They may build a tool or app that would be useful for your service or your citizens.


Left icon: created by Universal Icons from Noun Project;
right: background icon created by Gan Khoon Lay from Noun Project, with added text.

How to publish open data
At the first (we're hoping to run more!) Voyage of the Data Treader: library data camp 2017 at Liverpool Central Library last November, I facilitated a session on publishing your first dataset. We used loans figures as an example to discuss:
  • what data we wanted to publish: how "loans" were defined, where the data came from and how it was calculated, explaining all that alongside the raw data;
  • how the figures would be presented e.g. by year or by month? By library or by type of item loaned?
  • formats: for the dates and other elements within the data set but also for the file itself;
  • licences: Open Government Licence, Creative Commons or public domain;
  • where to publish: on the Council website, on a dedicated data portal.
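As a minimal sketch of the points above, here is one way a first loans dataset could be prepared. Everything in it is an assumption made for illustration: the library names, the column names and the figures are invented, and a real dataset would use whatever definition of "loans" your service agrees on. It writes one row per month, library and item type, with months in ISO 8601 form (YYYY-MM), to a plain CSV file, and produces a short note explaining the data and naming the licence to publish alongside it.

```python
import csv
import io

# Hypothetical figures, invented purely for illustration.
SAMPLE_LOANS = [
    {"month": "2017-02", "library": "Library A", "item_type": "book", "loans": 1150},
    {"month": "2017-01", "library": "Library B", "item_type": "dvd", "loans": 90},
    {"month": "2017-01", "library": "Library A", "item_type": "book", "loans": 1200},
]

# Hypothetical column names: one row per month, library and item type.
FIELDNAMES = ["month", "library", "item_type", "loans"]


def loans_to_csv(rows):
    """Return the rows as CSV text, sorted by month, library and item type."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    ordered = sorted(rows, key=lambda r: (r["month"], r["library"], r["item_type"]))
    for row in ordered:
        writer.writerow(row)
    return buffer.getvalue()


def describe_dataset(definition, source, licence):
    """Return a short plain-text note to publish alongside the raw data."""
    return (
        f"Definition of a loan: {definition}\n"
        f"Source: {source}\n"
        f"Licence: {licence}\n"
    )


if __name__ == "__main__":
    print(loans_to_csv(SAMPLE_LOANS))
    print(
        describe_dataset(
            definition="first issue of a physical item (renewals excluded)",
            source="library management system monthly report",
            licence="Open Government Licence",
        )
    )
```

CSV is used here because it opens in any spreadsheet and is easy for others to re-use; the accompanying note matters just as much, since figures without a definition are hard to compare.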

National initiatives in England and in Wales
In England, the Libraries Taskforce worked with public library authorities to create an open data set which is a snapshot of libraries on 1st April 2010 and 1st July 2016. The Taskforce is now working on:
  • creating a schema – a standard way for all public libraries to release their data following the same rules so information from different libraries is easily comparable;
  • effortless ways for each public library service to regularly update the part of the dataset relevant to them.
In Wales, CILIP Cymru Wales has coordinated work to add all Welsh libraries to Wikidata. Wikidata is a Wikimedia project set up as a collaborative source of public domain data for other projects. That means anyone can add or edit information about a library on Wikidata – in a similar way to how Wikipedia is edited – as well as use the data to create useful things (in this project: maps and other visualisations).
Claire's post on Twitter

Made by us
Ultimately, it's up to us library staff to discuss open data, learn new skills around data analysis and data visualisations, and perhaps even collaborate on projects. If you have any questions, feel free to contact me; if you're on Twitter and spot an interesting open data article or initiative, do use the #DataTreaders hashtag; and if you'd like to take part in another library data camp event, please say so!

Sunday, 3 December 2017

The public library as a place for the sharing of culture

I wrote this case study of Newcastle Libraries in December 2016 for inclusion in Fred Saunderson's and Gill Hamilton's book Open licensing for cultural heritage, published in August 2017. This is one of two contributions published under a Creative Commons Attribution licence (the other being Merete Sanderhoff's "Small steps, big impact: how SMK became SMK Open").


Newcastle Libraries are the public libraries serving the citizens of the City of Newcastle upon Tyne, UK. Newcastle is the biggest city in the North East of England and its library service, in particular the City Library, attracts users from across the region and beyond. The City Library houses the local studies and family history collections - this section also regularly receives requests and enquiries from overseas customers.

In the early 2000s a funded project allowed Newcastle Libraries to digitise a large part of its local history photographic collections and to publish them on a dedicated website called Tyneside Life and Times. However, a few years later the website encountered technical difficulties and the photographs were moved to the Flickr image hosting platform in June 2009. When the Flickr albums were created the images’ legal status appeared as the default copyright setting. Download was originally disabled but this was changed early on, although this particular feature was never publicised. Apart from the Torday collection (a thousand photographs of 1960s-1970s Newcastle) which was digitised by a volunteer and uploaded to a new album, the historic images collection on Flickr has only been extended on an ad hoc basis.

In 2015 I started developing at Newcastle Libraries the Commons are forever project, with support from the Carnegie UK Trust’s Library Lab programme. Commons are forever aimed to empower people and inform them of their rights to use and re-use works that are either in the public domain or available under an open licence, and encourage them to in turn share their creations with others. The project took the form of a series of events where members of the public were invited to create their own artworks in workshops facilitated by local artists, while learning about copyright and where to find free-to-use content.
A secondary goal of the project was to firmly re-position the library service as a place for the sharing of culture. Public libraries traditionally make knowledge and culture accessible through loaning materials to members of the community, but I believe raising awareness of works that are out of copyright - in the public domain i.e. that belong to all - or under open licences is also part of libraries’ role. On that basis, it made sense to me to use Commons are forever to also promote resources that are part of Newcastle Libraries’ collections and have entered the public domain. Since we were promoting free-to-use materials as part of the project we also needed to apply those sharing principles to our collections and our services.

The first step would be to correctly re-label the local history images on Flickr from “copyright - all rights reserved” to “public domain”. In order to get this agreed and done I started talking to colleagues in June 2015. It emerged that the issue was less about owning copyright over the digitised pictures and more about enforcing an indication of provenance: people who were not using our pictures for commercial ventures should be able to use them for free but be obliged to mention they were from our collections. However, it was felt that claiming copyright was still important because we were the keepers of the collection: if people want to make money from using our pictures then the library should get something too, and it should be clear that the images came from Newcastle Libraries. As we were selling copies of our pictures, the potential loss of income was mentioned - at a time of budget reductions even the small amount we were making may become significant.

Swing Bridge, Newcastle upon Tyne, 1889
From the Newcastle City Library Photographic Collection

After this initial meeting the conversation stalled as changing this particular policy which had been in place for a while was not part of the team’s priorities. The topic was picked up on several occasions over the following year and the number of people involved in the discussions was extended to the wider group of librarians. To get colleagues to understand why I wanted the Flickr images’ status changed from in copyright to public domain I used arguments such as: “because you’re trying to claim rights that you probably don’t have, what we are doing now is slightly illegal but also ethically wrong”!

Towards the end of the Commons are forever project the focus moved from sharing creative works to sharing data and information collected by the library service. We released performance statistics and usage figures as open data - using the UK Open Government Licence (OGL) which allows anyone to re-use the information in any way, as long as the source of the information is credited. In April 2016 we ran a one-day hackathon where we invited members of the community to “play” with our open data. For the occasion we were also given permission to publish 31 digitised historical maps of Newcastle from the libraries’ collections - in the public domain, clearly labelled as such in a Flickr album. The maps proved quite popular, with several participants superimposing “old Newcastle” onto a current map to highlight the evolution of the city centre.

I think what happened with the maps helped to show colleagues what releasing our information and content meant, and more importantly that it did not harm the library service. On the contrary, it was interesting to see what citizens had done with our maps when appropriating them - re-using them in ways we had not thought of and contributing to the visibility and reach of our collections.

Plan de Newcastle ou Neuchastel
From the Newcastle Libraries collection

In August 2016 we changed the status of our local history images on Flickr to “public domain”. Each album now bears the mention:
“These images are, to the best of our knowledge, in the public domain. You are welcome to use them in any way you like – we would love it if you could say you got them from the Newcastle City Library Photographic Collection. If you want to use the images for commercial purposes we can provide you with a high quality digital image for a fee – just contact us.”
On the spur of the moment, it was also decided to move the Torday collection (the copyright of which had been assigned to Newcastle Libraries) into the public domain - under CC0.

We were pleased to see this initiative bear fruit a few months later, with an article in a local newspaper about Newcastle’s old Odeon cinema featuring several of our Flickr images - all in the public domain but nevertheless used with the mention “from the Newcastle City Library Photographic Collection”.

Around the same time as we changed the status of our local history images on Flickr to “public domain”, we also decided to stop using OGL for our open data and use CC0 instead, making it even easier for our information to be re-used.

In December 2016 we went further: we librarians agreed that in the future all Newcastle Libraries collections and documents published online would be made open by default. All public domain materials digitised from our collections will be clearly labelled as such when published. Materials created by library staff - images, event pictures, information booklets, training guides, etc. - will be published under a Creative Commons Attribution licence. In 2017 we will start making more of our content available via platforms such as Flickr and GitHub.



Thursday, 28 September 2017

Supporting citizens with protecting their privacy online

This post is based on my talk at the CILIP Conference on 6th July, which I wrote up for K & IM Refer: Journal of the Knowledge and Information Management Group (CILIP). This article has been published online as part of the K & IM Refer Autumn 2017 issue.

All the technology around us - cameras, phones, our internet use, online communications, etc. - collects data about us. For example: most of us carry a smartphone around all the time. How many of us are fully aware that if the GPS is on, our phone company can pinpoint where we are with an accuracy of 5 to 8 metres? If the phone company knows, who else may have access to our location data? Are we comfortable with this situation? Now that you know this, would you change your behaviour and turn off your GPS when you are not using it, or would you decide the convenience outweighs the disadvantages?

Privacy is about choice. As citizens, we need to be aware of this situation to be able to make informed decisions about whether we want to protect some of our data and how much effort we are ready to put into protecting our privacy. Once we have the facts we also need the skills: we need to know about tips and tools available to help us protect our information.

Libraries defend people's rights
I believe that libraries exist to defend people’s right to enrich and improve their own lives, their environment and society. We library and information professionals make this happen by facilitating access to and the sharing of information, knowledge and culture.

In many sectors library and information professionals already devise and deliver digital skills training, ranging from a basic introduction to computers to searching online resources effectively. Knowing how to protect one’s privacy online is part of those digital literacy skills everyone should have; that's why at Newcastle Libraries we have started looking into how we could best help our citizens.


Learning about privacy issues and tools
Our team's awareness of privacy issues originally came from reading technology articles or from initiatives in libraries in other countries such as France or the USA. American librarians have created very useful materials that are a good place for us in the UK to start learning – I would particularly recommend the Library Freedom Project and the Data Privacy Project.
In Scotland, Scottish PEN has also been delivering Libraries for privacy: digital security workshops with support from CILIP Scotland and the Scottish Library and Information Council. I was able to attend one of those workshops, which inspired me to create a short training session for colleagues at Newcastle Libraries. I initially ran two sessions for librarians and senior managers in March 2017, and will be rolling the training out to as many staff as possible this autumn. The first two sessions included time for us to discuss and decide what we wanted to do in our service regarding online privacy.

Initiatives for citizens
We wanted to offer information and training about protecting one's privacy online to local citizens. In 2016 we had already co-organised two cryptoparties; we decided we should host some more. A cryptoparty is an informal gathering of individuals to discuss and learn about tips and tools for privacy and security in our digital world. We co-organised ours with local members of the Open Rights Group who have the relevant technological knowledge that we might lack (!) - in partnership with the same individuals, our next cryptoparty will take place in November.
We have also noticed that cryptoparties tend to attract citizens who are already aware of privacy issues. How do we reach out to those who do not (yet) have that awareness? It is something that we are still exploring. One idea we want to implement is to include privacy among the topics covered in our digital skills sessions, but we are also trying to find other ways to talk about privacy in a skills session without first announcing that this is what we are doing.


Standing up for citizens' privacy
With Newcastle Libraries colleagues we felt that we could not be teaching citizens about tools to protect their privacy on the Internet and yet say: “By the way, this does not apply when you are using library computers or services”! We want to offer our computer users an Internet browser with enhanced privacy features – ideally, this would be Firefox with DuckDuckGo as the default search engine plus add-ons such as HTTPS Everywhere and Privacy Badger. I would love for us to offer Tor Browser or even for the library to be a Tor relay; however, I thought asking first for Firefox would be a lot less controversial... We are in conversation with our IT department; they have objections but these are about the practicalities of applying updates to the Firefox browser, which they cannot manage centrally like they currently do for Internet Explorer and Google Chrome.

An easier thing we can and will do is to be more transparent with citizens about how their information is handled when they use Newcastle Libraries services. When you use a library computer, you should be aware that our IT department records which websites you visit and that this information is kept for 12 months. When you use our e-books platform, we should tell you before you log in what our supplier does with your data. It may take some time but it is relatively easy for us to add this kind of information to our website and other materials.
Once we start with this work we can review what we record – should we really be keeping your browsing history for this long? What is it used for; are we legally obliged to do so? Regarding third-party providers of library services, we should be requesting that they take steps to protect your data to our standards.
In truth, what we need is a privacy policy – the American Library Association Office for Intellectual Freedom has some fantastic information and templates that are adapted to the US context but still give us some useful pointers. Writing privacy terms and policies is a bigger piece of work but it is one we can build one chapter at a time, in order to support citizens with protecting their privacy online.