Tuesday, 7 August 2018

Release your (library) data!

This article was written for the CILIP Public & Mobile Libraries Group journal and published in Access issue 18, Spring 2018.

Open data is information that is publicly available and has been placed under an open licence. "Open" means some (or all) of the copyright has been waived so that anyone is free to copy, use, modify and share the information, even for commercial purposes.

Newcastle Libraries have been publishing open data since March 2016; we started with information that we already collected and that was therefore easy to publish: number of loans, of visitors to libraries, titles in our catalogue, etc. Other UK public library organisations - for example Leeds Libraries, Plymouth Libraries and Libraries Unlimited, as well as LibrariesWest - have started releasing their own data, often publishing slightly different information.

Why open data and libraries
As library staff it is our role to facilitate access to and the sharing of information, knowledge and culture. We are used to giving citizens access to content created by others: access to books, music, films, images, online resources... but what about information we create ourselves? A lot of data about the library service is generated by the systems we use: our library management systems, computer bookings systems, automatic people counters, etc. or inputted e.g. addresses of libraries, opening hours, services offered at each site...

My opinion is that, to fulfil our role as library staff completely, we should also enable citizens to access the information we create about their library service. We are only the custodians of this information, and we should be giving it back to them in a way that will allow them to reappropriate it, i.e. as open data.

What does open data mean for you?
Answers from participants at Next Library conference Open data mon amour workshop, June 2017.

Benefits of publishing open data
Sometimes I get the comment: "That all sounds well, but that's not going to convince my head of libraries. Why should we dedicate some time to publishing our information as open data over something else?" I'll be honest: in Newcastle Libraries we did it because we believe in it. And it worked. But here are some more arguments for you:
  • Commitment to transparency
Publishing information as open data shows a library authority’s commitment to transparency, by being up-front with the citizens it serves and providing them with authoritative data about its services.
It also means the library service cannot be accused of hiding anything since on the contrary it has made its information and statistics public and easy-to-find. In a way, it is bypassing a type of journalism looking for juicy stories or lowering the risk of those stories having a big impact.
  • Savings on FOI
Actively publishing data under an open licence saves time on some Freedom of Information requests, as the information will already be available and you can simply refer to where it is stored.
  • Benchmarking at a lower cost
If more and more library authorities publish information about their collections, their services, usage, etc. as open data and it is done in a standardised way (using a schema), we will be able to compare a library authority with another without having to pay (CIPFA) to get that information.
  • Demonstrating social impact
Libraries Unlimited have been running an Arts Council England-supported project that looks into using library data to better demonstrate the social impact of public libraries. It is worth reading about the project's outputs so far (on the Unlimited Value blog).
  • Unexpected help and insights
You never know what citizens are going to do with your open data. To be frank: they may not do anything with it, or they may do something that you won't hear about. But publishing your data gives the opportunity to citizens – some of whom may be data scientists, developers or simply curious individuals – to look at it, analyse it alongside other types of data, build something with it. They may contact you with their comments and findings, giving you new insights into your data and your library service. They may use your data in their research, giving your service some visibility in places you wouldn't reach. They may build a tool or app that would be useful for your service or your citizens.

Left icon: created by Universal Icons from Noun Project;
right: background icon created by Gan Khoon Lay from Noun Project, with added text.

How to publish open data
At the first (we're hoping to run more!) Voyage of Data Treader : library data camp 2017 at Liverpool Central Library last November, I facilitated a session on publishing your first dataset. We used loans figures as an example to discuss:
  • what data we wanted to publish: how "loans" were defined, where the data came from and how it was calculated, explaining all that alongside the raw data;
  • how the figures would be presented e.g. by year or by month? By library or by type of item loaned?
  • formats: for the dates and other elements within the data set but also for the file itself;
  • licences: Open Government Licence, Creative Commons or public domain;
  • where to publish: on the Council website, on a dedicated data portal.
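
As a rough illustration of the points above, here is a minimal sketch of a loans dataset written as a CSV file, with months in ISO 8601 format (YYYY-MM) and one row per library, month and type of item. The library names, column names and figures are invented for the example; they are not our actual data or file layout.

```python
import csv
import io

# Hypothetical figures, invented for illustration only.
loans = [
    {"library": "City Library", "month": "2017-01", "item_type": "book", "loans": 1200},
    {"library": "City Library", "month": "2017-01", "item_type": "dvd", "loans": 85},
    {"library": "Branch Library", "month": "2017-01", "item_type": "book", "loans": 310},
]

# Assumed column names; a real dataset would document what "loans" counts.
FIELDS = ["library", "month", "item_type", "loans"]


def loans_to_csv(rows):
    """Return the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = loans_to_csv(loans)
print(csv_text)
```

A plain CSV file with unambiguous dates can be opened in a spreadsheet or read by a program, which is one reason it is a common choice for a first dataset. The definition of "loans", the source of the figures and the licence would still need to be explained alongside the file.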

National initiatives in England and in Wales
In England, the Libraries Taskforce worked with public library authorities to create an open data set which is a snapshot of libraries on 1st April 2010 and 1st July 2016. The Taskforce is now working on:
  • creating a schema – a standard way for all public libraries to release their data following the same rules so information from different libraries is easily comparable;
  • effortless ways for each public library service to regularly update the part of the dataset relevant to them.
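
To give an idea of what "following the same rules" could mean in practice, here is a small sketch that checks rows of data against a set of rules. The field names and rules are assumptions made up for this example; they are not the Taskforce's schema.

```python
import re

# Hypothetical schema: field names and rules are assumptions for illustration,
# not the Libraries Taskforce's actual schema.
SCHEMA = {
    "library": lambda v: isinstance(v, str) and v.strip() != "",
    "month": lambda v: isinstance(v, str)
    and re.fullmatch(r"\d{4}-(0[1-9]|1[0-2])", v) is not None,
    "loans": lambda v: isinstance(v, int) and not isinstance(v, bool) and v >= 0,
}


def validate_row(row):
    """Return a list of problems found in one row (empty if the row conforms)."""
    problems = []
    for field, is_valid in SCHEMA.items():
        if field not in row:
            problems.append(f"missing field: {field}")
        elif not is_valid(row[field]):
            problems.append(f"invalid value for {field}: {row[field]!r}")
    return problems


good = {"library": "City Library", "month": "2016-07", "loans": 950}
bad = {"library": "City Library", "month": "July 2016"}

print(validate_row(good))
print(validate_row(bad))
```

If every library service ran the same checks before publishing, figures from different authorities could be put side by side without first being cleaned up by hand.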
In Wales, CILIP Cymru Wales has coordinated work to add all Welsh libraries to Wikidata. Wikidata is a Wikimedia project set up as a collaborative source of public domain data for other projects. That means anyone can add or edit information about a library on Wikidata – in a similar way to how Wikipedia is edited – as well as use the data to create useful things (in this project: maps and other visualisations).
Claire's post on Twitter

Made by us
Ultimately, it's up to us library staff to discuss open data, learn new skills around data analysis and data visualisations, and perhaps even collaborate on projects. If you have any questions, feel free to contact me; if you're on Twitter and spot an interesting open data article or initiative, do use the #DataTreaders hashtag; and if you'd like to take part in another library data camp event, please say so!

Sunday, 3 December 2017

The public library as a place for the sharing of culture

This case study of Newcastle Libraries was written by me in December 2016 for inclusion in Fred Saunderson's and Gill Hamilton's book Open licensing for cultural heritage, published in August 2017. This is one of two contributions published under a Creative Commons Attribution licence (the other being Merete Sanderhoff's "Small steps, big impact: how SMK became SMK Open").

Newcastle Libraries are the public libraries serving the citizens of the City of Newcastle upon Tyne, UK. Newcastle is the biggest city in the North East of England and its library service, in particular the City Library, attracts users from across the region and beyond. The City Library houses the local studies and family history collections - this section also regularly receives requests and enquiries from overseas customers.

In the early 2000s a funded project allowed Newcastle Libraries to digitise a large part of its local history photographic collections and to publish them on a dedicated website called Tyneside Life and Times. However, a few years later the website encountered technical difficulties and the photographs were moved to the Flickr image hosting platform in June 2009. When the Flickr albums were created the images’ legal status appeared as the default copyright setting. Download was originally disabled but this was changed early on, although this particular feature was never publicised. Apart from the Torday collection (a thousand photographs of 1960s-1970s Newcastle) which was digitised by a volunteer and uploaded to a new album, the historic images collection on Flickr has only been extended on an ad hoc basis.

In 2015 I started developing at Newcastle Libraries the Commons are forever project, with support from the Carnegie UK Trust’s Library Lab programme. Commons are forever aimed to empower people and inform them of their rights to use and re-use works that are either in the public domain or available under an open licence, and encourage them to in turn share their creations with others. The project took the form of a series of events where members of the public were invited to create their own artworks in workshops facilitated by local artists, while learning about copyright and where to find free-to-use content.
A secondary goal of the project was to firmly re-position the library service as a place for the sharing of culture. Public libraries traditionally make knowledge and culture accessible through loaning materials to members of the community, but I believe raising awareness of works that are out of copyright - in the public domain i.e. that belong to all - or under open licences is also part of libraries’ role. On that basis, it made sense to me to use Commons are forever to also promote resources that are part of Newcastle Libraries’ collections and have entered the public domain. Since we were promoting free-to-use materials as part of the project we also needed to apply those sharing principles to our collections and our services.

The first step would be to correctly re-label the local history images on Flickr from “copyright - all rights reserved” to “public domain”. In order to get this agreed and done I started talking to colleagues in June 2015. It emerged that the issue was less about owning copyright over the digitised pictures and more about enforcing an indication of provenance: people who were not using our pictures for commercial ventures should be able to use them for free but be obliged to mention they were from our collections. However, it was felt that claiming copyright was still important because we were the keepers of the collection: if people want to make money from using our pictures then the library should get something too, and it should be clear that the images came from Newcastle Libraries. As we were selling copies of our pictures, the potential loss of income was mentioned - at a time of budget reductions even the small amount we were making may become significant.

Swing Bridge, Newcastle upon Tyne, 1889
From the Newcastle City Library Photographic Collection

After this initial meeting the conversation stalled as changing this particular policy which had been in place for a while was not part of the team’s priorities. The topic was picked up on several occasions over the following year and the number of people involved in the discussions was extended to the wider group of librarians. To get colleagues to understand why I wanted the Flickr images’ status changed from in copyright to public domain I used arguments such as: “because you’re trying to claim rights that you probably don’t have, what we are doing now is slightly illegal but also ethically wrong”!

Towards the end of the Commons are forever project the focus moved from sharing creative works to sharing data and information collected by the library service. We released performance statistics and usage figures as open data - using the UK Open Government Licence (OGL) which allows anyone to re-use the information in any way, as long as the source of the information is credited. In April 2016 we ran a one-day hackathon where we invited members of the community to “play” with our open data. For the occasion we were also given permission to publish 31 digitised historical maps of Newcastle from the libraries’ collections - in the public domain, clearly labelled as such in a Flickr album. The maps proved quite popular, with several participants superimposing “old Newcastle” onto a current map to highlight the evolution of the city centre.

I think what happened with the maps helped to show colleagues what releasing our information and content meant, and more importantly that it did not harm the library service. On the contrary, it was interesting to see what citizens had done with our maps when appropriating them - re-using them in ways we had not thought of and contributing to the visibility and reach of our collections.

Plan de Newcastle ou Neuchastel
From the Newcastle Libraries collection

In August 2016 we changed the status of our local history images on Flickr to “public domain”. Each album now bears the mention:
“These images are, to the best of our knowledge, in the public domain. You are welcome to use them in any way you like – we would love it if you could say you got them from the Newcastle City Library Photographic Collection. If you want to use the images for commercial purposes we can provide you with a high quality digital image for a fee – just contact us.”
On the spur of the moment, it was also decided to move the Torday collection (the copyright of which had been assigned to Newcastle Libraries) into the public domain - under CC0.

We were pleased to see this initiative bear fruit a few months later, with an article in a local newspaper about Newcastle’s old Odeon cinema featuring several of our Flickr images - all in the public domain but nevertheless used with the mention “from the Newcastle City Library Photographic Collection”.

Around the same time as we changed the status of our local history images on Flickr to “public domain”, we also decided to stop using OGL for our open data and use CC0 instead, making it even easier for our information to be re-used.

In December 2016 we went further: we librarians agreed that in the future all Newcastle Libraries collections and documents published online would be made open by default. All public domain materials digitised from our collections will be clearly labelled as such when published. Materials created by library staff - images, event pictures, information booklets, training guides, etc. - will be published under a Creative Commons Attribution licence. In 2017 we will start making more of our content available via platforms such as Flickr and GitHub.

Thursday, 28 September 2017

Supporting citizens with protecting their privacy online

This post is based on my talk at the CILIP Conference on 6th July, which I wrote up for K & IM Refer: Journal of the Knowledge and Information Management Group (CILIP). This article has been published online as part of the K & IM Refer Autumn 2017 issue.

All the technology around us - cameras, phones, our internet use, online communications, etc. - collects data about us. For example: most of us carry a smartphone around all the time. How many of us are fully aware that if the GPS is on, our phone company can pinpoint where we are with an accuracy of 5 to 8 metres? If the phone company knows, who else may have access to our location data? Are we comfortable with this situation? Would you change your behaviour and turn off your GPS when you don't use it now you know this, or would you decide the convenience outweighs the disadvantages?

Privacy is about choice. As citizens, we need to be aware of this situation to be able to make informed decisions about whether we want to protect some of our data and how much effort we are ready to put into protecting our privacy. Once we have the facts we also need the skills: we need to know about tips and tools available to help us protect our information.

Libraries defend people's rights
I believe that libraries exist to defend people’s right to enrich and improve their own lives, their environment and society. We library and information professionals make this happen by facilitating access to and the sharing of information, knowledge and culture.

In many sectors library and information professionals already devise and deliver digital skills training, ranging from a basic introduction to computers to searching online resources effectively. Knowing how to protect one’s privacy online is part of those digital literacy skills everyone should have; that's why at Newcastle Libraries we have started looking into how we could best help our citizens.

Learning about privacy issues and tools
Our team's awareness of privacy issues originally came from reading technology articles or from initiatives in libraries in other countries such as France or the USA. American librarians have created very useful materials that are a good place for us in the UK to start learning – I would particularly recommend the Library Freedom Project and the Data Privacy Project.
In Scotland the Scottish PEN has also been delivering Libraries for privacy: digital security workshops with support from CILIP Scotland and the Scottish Library and Information Council. I was able to attend one of those workshops, which inspired me to create a short training session for colleagues at Newcastle Libraries. I initially ran two sessions for librarians and senior managers in March 2017, and will be rolling it out to as many staff as possible this autumn. The first two sessions included time for us to discuss and decide what we wanted to do in our service regarding online privacy.

Initiatives for citizens
We wanted to offer information and training about protecting one's privacy online to local citizens. In 2016 we had already co-organised two cryptoparties; we decided we should host some more. A cryptoparty is an informal gathering of individuals to discuss and learn about tips and tools for privacy and security in our digital world. We co-organised ours with local members of the Open Rights Group who have the relevant technological knowledge that we might lack (!) - in partnership with the same individuals, our next cryptoparty will take place in November.
We have also noticed that cryptoparties tend to attract citizens who are already aware of privacy issues. How do we reach out to those who do not (yet) have that awareness? It is something that we are still exploring. One idea we want to implement is to include privacy among the topics covered in our digital skills sessions, but we are also looking for other ways to talk about privacy in a skills session without announcing beforehand that this is what we are doing.

Standing up for citizens' privacy
With Newcastle Libraries colleagues we felt that we could not be teaching citizens about tools to protect their privacy on the Internet and yet say: “By the way, this does not apply when you are using library computers or services”! We want to offer our computer users an Internet browser with enhanced privacy features – ideally, this would be Firefox with DuckDuckGo as the default search engine plus add-ons such as HTTPS Everywhere and Privacy Badger. I would love for us to offer Tor Browser or even for the library to be a Tor relay; however, I thought asking first for Firefox would be a lot less controversial... We are in conversation with our IT department; they have objections but these are about the practicalities of applying updates to the Firefox browser, which they cannot manage centrally like they currently do for Internet Explorer and Google Chrome.

An easier thing we can and will do is to be more transparent to citizens about how their information is handled when they use Newcastle Libraries services. When you use a library computer, you should be aware that our IT department records which websites you visit and that this information is kept for 12 months. When you use our e-books platform, we should tell you before you log in what our supplier does with your data. It may take some time but it is relatively easy for us to add this kind of information to our website and other materials.
Once we start with this work we can review what we record – should we really be keeping your browsing history for this long? What is it used for; are we legally obliged to do so? Regarding third-party providers of library services, we should be requesting that they take steps to protect your data to our standards.
In truth, what we need is a privacy policy – the American Library Association Office for Intellectual Freedom has some fantastic information and templates which are adapted to the US context but still give us some useful pointers. Developing privacy terms and policies is a bigger piece of work but it is one we can build one chapter at a time, in order to support citizens with protecting their privacy online.

Wednesday, 21 June 2017

DataPrivacyNY, part 2: Privacy in a digital age seminar

I was very lucky to be invited by the Carnegie UK Trust to a study trip to New York on public libraries and online data privacy, which took place 15 to 19 May. In part 1 I wrote up my notes from the introductions to the topic and from a very useful meeting we had with the team at the New York Public Library.

On 17 May our group took part in a seminar entitled Privacy in a digital age which was held at the offices of the Carnegie Council for Ethics in International Affairs. The keynote speaker was Bruce Schneier, a technology security expert, with a response by Deborah Caldwell-Stone of the American Library Association (ALA) Office for Intellectual Freedom. When I first read the programme I thought: "Bruce Schneier? That sounds familiar... Isn't he the cybersecurity guy who wrote an afterword for Cory Doctorow's Little Brother?!" I may be a bit of a geek but: I was right - and his afterword, just like Doctorow's whole novel, is worth reading!

Note: the seminar was recorded - a transcript is available from the Carnegie Council, while the video of the full seminar (2 hours 12 minutes) and a highlights video (24 minutes) have also been published.

Albert Tucker from the Carnegie UK Trust was chairing the seminar. Ciara Eastell started it off with a short overview of the situation of public libraries in the UK and their role in privacy issues. She explained how public libraries are often the first and last resort for people to access online services and get support to do so. She highlighted the role of staff in providing this support, mentioning some Society of Chief Librarians (SCL) initiatives such as the training for all public library staff which accompanies the SCL Information Offer, the digital leaders training and the Innovators Network. However, privacy is not a topic staff are specifically trained on, and few UK public libraries have privacy policies.
Ciara frankly said that "the issue of data privacy is not one that ranks highly on the list of library leaders today" as austerity and budget cuts are much more pressing. 
But she also said that Newcastle Libraries [yes, that's us and fellow CryptoParty Newcastle organisers!!] are showing new possibilities regarding the potential of libraries around privacy.

Bruce Schneier started his speech by saying that all the technology around us - cameras, phones, our internet use, online communications, etc. - collects data about us. This data is increasingly easy to save and search, so much so that it is now easier to save everything than to figure out what to save. You can come back to this data later and search for specific words or patterns or incidents (this is mostly done by computers).
Bruce Schneier described metadata as data a system needs to operate. "Metadata is surveillance data", especially since "nobody ever lies to their search engine".
"Surveillance is the business model of the Internet."
Most of this data is held by corporations. We all know that the reason Facebook is free is that we are not the customer, we're the product. Data is valuable.
"Imagine if you had to alert the police every time you make a new friend... You laugh but you all alert Facebook."
The NSA and other similar organisations saw all this data being collected and thought of taking advantage of it. "Really we have a public-private surveillance partnership."
This situation has an impact on political liberty and justice, as well as causing problems of self-censorship. It also affects our security.

How do we fix this? We need security for privacy. And privacy is a part of security. We need to prioritise security over surveillance. Unfortunately secrecy means there isn't a robust debate in our society about this.
An example of this is all the talk about "encryption backdoors". Encryption backdoors are technically impossible: either you make a system secure or you make it not secure.

Our data together has enormous value to us collectively; our personal data has enormous value to us individually. Take medical data: it is very valuable for researchers when grouped together, yet sensitive for each individual when looked at separately.
"Data is the pollution problem of the information age": all processors produce it, it stays around. How do we deal with it? [I would not like to have to answer this question in an exam!!]

Deborah Caldwell-Stone then explained the position of librarians on the issue of privacy in a digital age. Librarians have a tradition of confidentiality; protecting user privacy has long been part of the focus of ALA and of the library profession.

According to Pew research, people trust their library - and use it to access information.
Librarians are the intermediaries in the fight against surveillance. The main focus is on education, so people can make good decisions about protecting their privacy. For example, San José Public Library offers on its website a virtual privacy lab, which anyone can use to learn about privacy and generate a customised toolkit that fits their needs. The tools promoted include Privacy Badger, HTTPS Everywhere, DuckDuckGo, Tor Browser... 
"It does no good to teach someone about Tor Browser and not put Tor Browser on the libraries' computers."
Deborah Caldwell-Stone mentioned several initiatives, including:
  • the ALA's Choose Privacy Week, which is held annually in May to raise awareness of the issues and best practice among librarians;
  • the Library Freedom Project: training for librarians so they can then train their customers;
  • the Data Privacy Project at Brooklyn Public Library, which included training for librarians across New York City and is now an online course;
  • the work of Bill Marden at the New York Public Library on developing contracts with systems and resources suppliers that include privacy standards.

The seminar was then opened to questions and comments from participants; here are a few.
  • How do we make privacy a broader topic plus change perceptions of privacy as a concern reserved for "people in tin-foil hats"?
    Bruce Schneier: "Privacy is not about something to hide, it's about how I choose to present myself to the world."
  • "The privacy thing sometimes I feel I care about it more than other people do", said a participant [who wasn't me, I promise!!]
  • There is sometimes a tension with data and how useful collecting it and using it can also be for libraries.
    Deborah Caldwell-Stone recommended reading Becky Yoose's article on de-identification and patron data.
  • How can we reconcile the fact that librarians are campaigning for privacy with the pressure from government against privacy?
    Deborah Caldwell-Stone: it's the role of the professional association; ALA can say a lot of things that a local librarian can't. Some library directors have also been very good at pitching privacy as a bipartisan issue.
  • What should librarians do about being asked to give up one thing i.e. their or their users' information and privacy - if they want another e.g. a software product for their library?
    Bruce Schneier: "We have collectively decided that we were going to make the internet free in exchange for privacy" but we don't have to. Software and tools don't have to be built that way.
  • "I'd like to think that libraries will remain a sanctuary for privacy and freedom of information."

Closing remark from Bruce Schneier: privacy in a digital age is about changing perceptions. Librarians make powerful statements when: using warrant canaries, offering Tor Browser in libraries...

After the seminar our group got a chance for a more in-depth chat with Deborah Caldwell-Stone. She explained that for ALA the case for privacy and libraries started in 1939, with a Library Bill of Rights which included privacy in response to the situation in Nazi Germany where librarians were being asked to inform on their customers to the police. There have been several other cases of US librarians taking a stand for privacy since then e.g. in the 1950s, in 1992 because of the Library Awareness Program, in 2001 in response to the Patriot Act... Deborah told us about the ALA Office for Intellectual Freedom (OIF) materials available on the topic, and made recommendations on what librarians could do.

  1. Check whether your institution has a privacy policy and whether it needs to be created or updated. OIF has a toolkit for US librarians on how to develop or revise their privacy policy.
  2. Look into encrypting your institution's own data and website. ALA has partnered with Let's Encrypt to help librarians do that.
  3. Implement best practice on different aspects of privacy in your institution. On the OIF website there are guidelines on best practice in relation to different topics e.g. e-book lending, library management systems, public access computers... For each set of guidelines there are corresponding checklists which summarise and prioritise (level 1, 2 and 3) what librarians should be looking to implement first.
    To put these measures in place you might need to figure out how to convince the chief IT person in your institution. Tip: pitch the idea in a way that benefits them e.g. it will improve security.
  4. Engage with your local community, create a place for dialogue around privacy. OIF has guides for hosting a discussion on privacy.
  5. Reach out to communities and provide opportunities for citizens to learn to protect their information.
  6. Advocate for better privacy laws, work with your legislator to change the law. Deborah described it as "grappling in the trenches with law makers and regulators"!