dkernohan

#opened14 - third day keynote from John Wilbanks

6 min read

John Wilbanks currently works at Sage Bionetworks. He was asked to speak about open science and open data.

He started by cautioning against "open silos": different campaigns using common tools and approaches but not speaking to each other. Science affects education, and both are affected by wider culture, and the culture of prediction.

Yogi Berra - "predictions are hard. especially about the future". (It is now easy to find older books to source quotations via web searches, though nothing from the last 25 years.)

It *was* really hard to make predictions about the future. But predictions are increasingly accurate - especially predictions about ourselves. Every single website is trying to sell you the same thing - it's not like they know you, they literally know you. Mining things like email data to make predictions has exploded over the past 10 years.

This is about probability. And this is basic mathematics.

Increasingly, fields are, or can be, data driven. Biology used to be a narrative science; now, with the advent of cheap shared data, it is a predictive science. He gave the example of services like "23andMe" (consumer genetics), or Science Exchange - eBay for university science services.

It now costs $200 per sample to do RNA microarray. Tools for science and analysis are cheaper.

This is not just about hard science. In archaeology there are huge amounts of archive data. Even in etymology, we can find the origin of quotes.

Everything is text. So every field has a data wave coming. Everything is increasingly measurable and indexable.

So probabilistic analysis is going to be the academic coin of the realm. And advertising is making the methods and tools more accessible.

Probability changes every time we add new information to the model. This changes educational culture, and changes the needs for training and skills. He said that current pedagogy is failing - there is no continuing education for sciences. So it is hard for academics to deal with the data flow.
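As a rough illustration of the point that a probability estimate shifts whenever new information arrives, here is a minimal sketch of a Beta-Binomial update. It is not something shown in the talk: the function names and the observation counts are invented for the example, and the flat Beta(1, 1) starting prior is an assumption.

```python
from fractions import Fraction


def update(alpha, beta, successes, failures):
    """Return the Beta parameters after observing new outcomes."""
    return alpha + successes, beta + failures


def mean(alpha, beta):
    """Expected probability of success under a Beta(alpha, beta) belief."""
    return Fraction(alpha, alpha + beta)


# Hypothetical data: start with no opinion, then see two batches of results.
prior = (1, 1)
after_first = update(*prior, successes=3, failures=1)
after_second = update(*after_first, successes=1, failures=5)

print(mean(*prior), mean(*after_first), mean(*after_second))
# prints: 1/2 2/3 5/12
```

The estimate moves from 1/2 to 2/3 and then down to 5/12 as each batch is added, so anything "known" after the first batch has to be revised after the second.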

In the sharing economy, a larger market makes for a better economy. Though these are rental economies, not good for labour or conditions. And service owners don't want you to be a buyer and a seller - in science, we want to be able to be both.

These markets are better (for buyers) than the terrible status quo. But this isn't good enough. Open multi-sided platforms allow individual actors to have multiple roles.

In the open movement, we don't focus on adding users. We need lightweight ways to move people in ways like shifting from being a wikipedia viewer to a wikipedia contributor. And getting value from both sides - increasingly as more people are involved.

It is not about the assets (or the license choices), it is about the users. And these may be people who don't agree with us philosophically. He gave the example of open source - methodologically and economically it succeeded. The philosophy is great, but that wasn't what drove growth.

Wilbanks asked of any "open" activity - "does it create more value than a closed version?". Openness is a methodology that gets assets and data in front of people.

So selling value rather than philosophy is selling a practice change. Work at Merck on cancer is open, via a non profit operation. It allows anyone to use genetic data.

Analytic tools to analyse this data need to be used alongside experience - so can we create an open multi-sided market to bring these together - not just solo labs (as the natural unit of science) but communities. Government funding now works to foster collaboration, and open approaches can simply play into this (eg TCGA Pan-Cancer Consortium).

In this example, open methods allowed the consortium to analyse data collaboratively, by instigating a culture of sharing clear information. So science practice has improved via open approaches, such as version control for annotations and metadata. This allows researchers to see every stage, and allows us to be confident in probabilistic analysis.

And for researchers not used to these ways of working, this practice sucks. It is new, and slow. But the value realised in terms of academic activity (papers etc) is immense. And this led it to gain users from across TCGA.

This was a community that was required to work together, but what about those that are not? In colon cancer we saw 4 (or more!) simultaneous papers postulating different genetic subtypes for the disease. Open approaches allowed groups to test their methods across all of the 13 data sets. So a consensus subtype, with high probabilistic confidence, emerged.

The approach is now exploding across research groups. And it makes challenges possible to widen communities - more eyes on the problem. For example computing the probability of cancer relapse. The winner (with the competition as peer review) gets a guaranteed high-impact journal publication, but code sharing is required to be eligible.

The winner actually got a cover, an opinion piece, a methods paper and a results paper. And an entire suite of tools was generated (even from outside medicine) for others attacking the problem - the winning entry was from the lab that invented the mp3 codec.

If you have an open player in the market, it changes and improves the market. Less immoral. Less asshole-y.

So we need to think about our practice - how do we govern open platforms? How do we design and cost them? Wilbanks felt that the biggest challenge the open movement faced was platform design, to drive engagement. The iPhone was not designed around the idea of a closed ecosystem - it was designed around value to the user.

With an open platform, you are not just a buyer or a seller. You are a citizen. You are a member. And good design means you are the priority.

Licenses like CC BY and CC0 give users more value. And a winning design can embed this into places where open had not been previously considered. He gave the example of informed consent (which reminded me of early UK work on the consent commons), claiming that better designed forms would make it easier to find research participants, allowing for larger scale (and thus more probabilistically confident) findings.

This led to collections of noun, verb and sentence icons and animations, and storyboard templates, put into the public domain. These allow the simple creation of stories that can properly inform consent. The same work uses mobile technology and sensors to gather and analyse research data (for example gyroscopic sensors to measure hand tremors in Parkinson's patients).

As a fully open tool, these informed consent approaches can be used in a variety of contexts. Allowing other people to do things that the product creators cannot do, or had not considered. Again, an open method creates more value. Economic value, educational value.

In probability, adding more data refines the model. But what we "know" becomes less stable as more data is added, so pedagogy needs to change to reflect this emerging ontological instability. So the right to reuse becomes the right to be current, and to get better, and to create value.

And value is not just economic in open systems - it is social value and knowledge value.

dkernohan

#opened14 - second day keynote with Heather Joseph

5 min read

Heather Joseph is the executive director of SPARC. Her background is in publishing, including (only!) 11 months with Elsevier. 

The Open Access movement has been deliberately focused on journal articles as a primary academic output. But they have been very aware that they are not operating in isolation.

This presentation focused on Heather's experiences, and the lessons that have been learned - looking at the parallels between OER and Open Access. It looked to highlight opportunities for collaboration between the two movements.

Technology has been a major driver for changes in scholarly communications. People are sharing academic work via commercial social media. And this is not just sharing work, this is doing work. Ending up with a "whole lot more digital stuff".

Heather gave the example of the human genome as a case study for means of dealing with this data deluge, and the issues that arise. Between 2005 and 2008 the number of findings taken from the digitised human genome grew exponentially. Submissions to GenBank also grew exponentially. This put an enormous amount of pressure on the way we share information - there is too much to sit and read articles in a linear fashion.

Enter the concept of the computer as a reader, with huge implications for copyright.

A further driver has been the prohibitive cost of journals, similar to that of textbooks. Leasing annual access to journals is astonishingly expensive, and has grown by 340% over the last 14 years. One outcome is that we all run into paywalls when looking for research.

So what do we do? We ask the author for a copy, ask a colleague who has access - or go on Twitter. Or we skip the article and move on. We are operating a system that forces workarounds - we need to optimise the system so it works for scholars.

The Open Access movement is trying to do this. Heather showed the Budapest Open Access Initiative definition (2002). Shortened as "immediate availability plus full reuse".

Enabling strategies have included OA journals and repositories, and policy lobbying.

OA journals are an alternative to the existing system - offering the same standards as traditional journals, plus free and full access and reuse. Most are available under a CC-BY license.

Repositories are a key component of the infrastructure, allowing authors to make articles accessible and to see them preserved and shared. They are digital collections, that now include things like data, and teaching and learning materials. Interoperability is essential.

Mashing up DOAR and Google Earth shows a healthy infrastructure - though interoperability still needs work. As this infrastructure has grown, so has policy maker interest.

Policy makers are often focused on maximising social returns on public investment (OECD 2005) by making research findings more widely available. This has enabled an international policy focus based around public entitlement. NIH mandated OA publication in 2008 based on this pressure, and all federal agencies are now required to issue similar policies.

After 10 years we have built a lot. Use and trends are increasing. But we are realistic that there is a lot more work to do. Since 2013 only one of the federal agencies has released an OA plan. Only 45 institutions have an OA policy in the US. Less than 20% of articles are deposited in open repositories.

And, as Larry said yesterday: "They're coming for you".

The academic publishing industry is worth $9.4bn, a similar size to the NFL. And they want to preserve this revenue. Publishers like Elsevier are making funding contributions to legislators. The publishing lobby (note there are some commercial publishers that are trying to do the right thing) has a huge "war chest" for influencing policies. Money does buy influence.

The lobby has spent its money on PR: Dezenhall was engaged for a 6 month period in 2007 to run a media messaging campaign against OA. It was noted that the OA message was almost bulletproof - and that the "messages didn't have to be true" to be effective. This produced ridiculous messages that needed to be rebutted. Money can also buy distraction.

These things will happen, and we will be able to overcome them.

In 2007 SPARC was working with "Students for Free Culture", an organisation inspired by Larry Lessig. They successfully sued Diebold over the 2000 election. They defended Tom Forsythe against Mattel for using Barbie images - the "Barbie in a blender" day of action.

SPARC and SFC ran a small campaign on the prices of journals. So the PR campaign called SPARC "Barbie-Blenders".

But how can we keep winning? We keep winning if we work together, if we build our communities. (wide communities, noting that early-career researchers are key.)

We win when we build better resources for our communities to work with. And these become the preferred resources. The OA campaigns work openly themselves, and this is a strength.

A closing story about Lego demonstrated the benefits of being able to take the pieces apart and put them back together. Lego looks interoperable with Mega Bloks, but it is not. The specs look close, but they are not exact, so structures can collapse. Open campaigns need to adopt the same specifications (technically and legally) to allow bigger structures to be built.

opened14

dkernohan

#opened14 - first day keynote with Larry Lessig

5 min read

The only person that still has the power to get Lessig to talk about copyright issues is David Wiley. He took the chance to think back to the time he worked on this issue - it was the good and the bad together, working with people who want to create things and people who want to stop them.

Aaron Swartz was a real organiser in the early days, and insisted on a focus on "grabbing the theoretical and making it practical".

Around the time of the "Laws that choke creativity" TED talk, Swartz asked "how do you think you are ever going to achieve what you are trying to do whilst government is still corrupt?". Lessig said it wasn't his field - but Swartz suggested it was his field as a citizen.

So the next chunk of his life was devoted to examining this problem. It was like giving up the hopeful part and focusing on the depressing part.

"Tweedism" - a single word to underline the problems with government. "I don't care who does the electing as long as I do the nominating." The question is whether the filter in between nomination and voting is biased.

Parallel with emancipation - there were all-white primaries before the general election in Texas 100 years ago. So democracy was responsive to whites only.

Again parallels with the Hong Kong umbrella protests. These are protests about tweedism. 0.24% of electorate get to nominate.

We take for granted in the US that campaigns are privately funded. And getting funding is the first stage of the process. He showed the "Skinner box" as a metaphor for people knowing how to work the system to get to funding. 30-70% of Congress candidates' time is spent "calling" for funding.

About 150,000 people in the US (will fall to 35,000) are funders who give important amounts. A tiny fraction of the 1% control the first stage of the system - the "green primary".

Gilens and Page (Princeton) showed that if economic elite or organised interest group preference is high, a policy will pass. But large support from voters has no effect.

Income distribution across cycles is - frankly - terrifying. And the reason for this is changes in government policy. Set by an economic elite.

Lessig was focused on this through the lens of copyright - since the 1990s Sonny Bono Copyright Term Extension Act. Did this advance the public good? Economists overwhelmingly said no (even Milton Friedman). But Congress passed it because there was a financial interest and lobbying.

People like Hal Plotkin have won important victories in Open Education advocacy in the Obama administration. But in other ways things have got worse. Some blame the revolving door between civil servants and industry trade groups.

Not all provisions in US law are exported, rights holder provisions are exactly copied across, those for users less so.

Drawing on "Free Culture", Lessig explained the difference between fair uses and free uses. Fair uses are uses that would otherwise be regulated, primarily around making copies. But in the digital age everything you do produces a copy, so everything is "presumpted" to be regulated.

Patterson argued that the insertion of "copy" in 1909 was a mistake. But this "mistake" has led to the extreme regulation of "temporary copies". An absurd position in the digital age. The Obama administration is currently arguing to enshrine this in law.

Hollywood lobbyists have argued that not helping Hollywood (eg SOPA) would lead to Hollywood not helping governments.

Could the department of labour require that new education content commissioned ($100m) be CC-BY? There was a clause (124) that suggested that the government should check that no commercial content should exist in these spaces. Was argued down. But we were "Not important" enough to be defeated.

But this is not just about you. It's pouring honey into a Swiss watch. It's stopping processes working for the popular good, because blocking is easy for economic elite interests.

Fukuyama talks about a "vetocracy" - it is easy to block sane policy because of the way funding works. The democracy part just doesn't matter.

The solution would be to change the way campaigns are funded. To pass 1 statute, to decentralise campaign funding. With public funds. The "obvious and first answer".

But what explains the failure? Pundits say that "people don't care". The Mayday PAC focused on proving this; it wasn't seen to work in the 2014 elections.

The question is not "do they get it?" The question is "will they [WE] do something about it?". It's not because we like it, it is because we are resigned to it. What it means to "grow up", to "accept the reality" of modern life and corruption. How do we resist this?

1. Talk about feasible change. (eg a statute first, not an amendment)

2. Focus on ideals that inspire. 

3. Teach.

To recruit the audience, asking for 10% of our effort to focus on this underlying problem. If you want to make the world better - you need to make it possible to make the world better.

Against all odds, we at opened are fighting against a large producer interest for something that makes sense. What we need to talk about, to rediscover, is a special different sense of the word "Hope".

Václav Havel - Hope is definitely not the same thing as optimism. It is not the conviction that something will turn out well, but the certainty that something makes sense, regardless of how it turns out.






opened14