Thursday Community Call Recap 11.17.22

Hello everyone! Welcome to the weekly Thursday Community Call Thread. We really want to engage our community to think and learn about new topics in these calls. Unfortunately, not everyone gets a chance to attend the call, and those who do don't always get a chance to speak due to time constraints.
There is value in continuing the types of conversation we start in these calls. We don't want the call to be the last time people engage with the topics addressed. We want to enrich your days with inspiration and open your eyes to new concepts. These topics might be applicable to some of your work or spark a personal interest. It is our goal to facilitate new ideas and creative thinking through this thread.

Community Call Link:

11/17 Community Call Recap: SCRF Mission and Vision - Beyond White Papers

This week's community call was a presentation and discussion on smart papers given by the folks from Blockscience Labs. Blockscience Labs is separate from Blockscience: it is a data science product company that builds solutions to help both systems engineers and executive scientists make better business decisions through scientific processes. Our guests were Chris Frazier (CEO and Co-Founder) and Andrew Clark (CTO and Co-Founder). Definitely watch the video so you can follow along with a visual of the smart paper!

Modeling

Andrew began with an overview of loss functions, particularly as a tool for building unbiased models and deploying AI responsibly. The core focus was on why smart papers are well suited to describing a topic like this. Smart papers allow code to live alongside the text, support interactive plotting, and are overall more instructive for the reader. They aim to bridge the gap for non-technical people while still conveying high-level concepts.

A brief overview showed that loss functions calculate the error between actual values and predicted values. In machine learning there are two basic model types: regression models, which predict a continuous output, and classification models, which predict between two classes. The loss function is a repeatable calculation of the model's error, used to track and reduce that error over time.
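The call didn't share the exact code, but a minimal pure-Python sketch of the two loss functions discussed later (MSE and MAE) might look like this:

```python
def mse(y_true, y_pred):
    """Mean squared error: penalizes large errors quadratically."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error: penalizes all errors linearly."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Small illustrative example (made-up numbers):
y_true = [1.0, 2.0, 3.0]
y_pred = [1.5, 2.0, 2.0]
print(mse(y_true, y_pred))  # (0.25 + 0 + 1) / 3 ≈ 0.4167
print(mae(y_true, y_pred))  # (0.5 + 0 + 1) / 3 = 0.5
```

Because MSE squares each error, it punishes outliers much more heavily than MAE, which is one reason the choice of loss function matters.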

For the discussion, they used a basic linear regression model. Linear regression is a simple model that fits a straight line (y = mx + c). When training the model, they are trying to find m and c, because x is the given data. They use matrix notation to express this:

[image: the linear regression model written in matrix notation]

They generate 1,000 samples from a Gamma probability distribution and use MSE and MAE, two common loss functions, to train the linear regression model with gradient descent. This is randomly generated data, but it illustrates what they are trying to achieve.

As they train on this data, both the MSE and MAE loss functions optimize the model to fit it. One point to note: it is important to keep the model unbiased with respect to attributes like race or neighborhood statistics when the outcome should really depend on the performance of the individual, since the model will use only the data it's given to determine outcomes. Multi-objective optimization is introduced as a way to combat this: alongside predictive accuracy, you add another objective of being fair and unbiased.
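The call didn't spell out the exact fairness objective used, but one common way to set up a multi-objective loss is to add a weighted fairness penalty to the accuracy term. The sketch below (a hypothetical demographic-parity-style gap between two groups, not necessarily the speakers' formulation) shows the shape of the idea:

```python
def multi_objective_loss(y_true, y_pred, groups, fairness_weight=1.0):
    """Accuracy term (MSE) plus a fairness penalty, combined with a weight."""
    n = len(y_true)
    # Accuracy term: ordinary mean squared error.
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    # Fairness term (illustrative): squared gap between the mean
    # prediction for group 0 and the mean prediction for group 1.
    g0 = [p for p, g in zip(y_pred, groups) if g == 0]
    g1 = [p for p, g in zip(y_pred, groups) if g == 1]
    gap = (sum(g0) / len(g0) - sum(g1) / len(g1)) ** 2
    return mse + fairness_weight * gap
```

Tuning `fairness_weight` then trades off accuracy against fairness explicitly, which is the core of the multi-objective framing discussed on the call.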

There is a lot more to research on this topic, so anyone interested should be sure to reach out to Andrew about it!

Smart Papers

The previous section was very complex, even more so in the call. How do we take a complex topic like this and create a more digestible way to absorb the information? Smart papers.
A smart paper is a 'simulation-enabled' white paper. You are still presenting analysis and results, but in a way that creates a digital, interactive museum. Smart papers are observable AND interactive, so people can walk away with a better understanding of the topic.
Something like a whitepaper or a litepaper is turned into a multimedia document that is accessible to all levels of understanding.

Who is this for?

With a smart paper, there is the marketing challenge of finding the people who want what you have. Chris proposes that these papers can help engage a community and generate more interest. This is also a way to change how research and journalism are shared: having something to interact with can make all the difference in, say, an article.

This allows people to test assumptions and play with the parameters of a working ecosystem, all while reading and learning about it. It increases the legitimacy of the project.

Chris showed an example of a smart paper built around the Grand Ethiopian Renaissance Dam. It combines many different elements (Jupyter Notebook, JavaScript, Python, text, etc.). The example is interactive and holds true to all the aforementioned components of a smart paper, allowing this theoretical project to grow.

The first goal is to increase the audience of a given topic through the multimedia document. The eventual goal is figuring out how to scale it.

A few questions to consider from the call

  • In what ways not mentioned on the call could projects benefit from smart papers?
  • How could SCRF benefit from smart papers?
  • What other elements can be included to make smart papers more accessible to non-experts?
  • How do we keep data non-biased?
  • How can smart papers be scaled?

If any of these spark your interest or you have other questions / thoughts, please discuss below!
