Quickstart: Creating a Project

This is a short guide to creating a project in a PYBOSSA server. Readers who want to understand all the details may wish to start with the Step by step tutorial on creating a Project, which walks through creating a simple photo classification project.

First of all, we have to create a project. A project represents a set of tasks that have to be resolved by people, so a project will have the following items:

  1. Name,
  2. Short name or slug, and
  3. Description

The slug or short name is a shortcut for accessing the project via the web (short URLs like http://domain.com/project/slug).

The description is a short sentence that describes your project (think of it as a tweet-length description).
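As a rough illustration, the three items can be pictured as a small record. The sketch below is not PYBOSSA code: the `project_url` helper and the `http://domain.com` base address are assumptions made only for this example, and the field values are the Flickr Person ones used elsewhere in this guide.

```python
# Illustrative only: a project described by its three basic items.
project = {
    "name": "Flickr Person Finder",
    "short_name": "flickrperson",  # the slug
    "description": "Do you see a human in this photo?",
}


def project_url(base, short_name):
    """Build the short URL under which a project is reachable (hypothetical helper)."""
    return "{}/project/{}".format(base.rstrip("/"), short_name)


print(project_url("http://domain.com", project["short_name"]))
```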

A project can be created using two different methods: the web interface or the API.

Using the Web Interface

Creating a project using the web interface involves four steps:

  1. Create the project,
  2. Import the tasks using the simple built-in Task Creator (uploading a CSV file or a Google Spreadsheet link exported as CSV),
  3. Write the Task Presenter for the users, and
  4. Publish the project.

Creating the project

In order to create a project in PYBOSSA via the web interface you have to:

  1. Create a local account in your PYBOSSA server:

PYBOSSA Register

Alternatively, you can sign in with your Twitter, Facebook or Google account (if the server has enabled them; see the documentation: Enabling Twitter, Facebook and Google authentication).


PYBOSSA sign in methods

  2. Once you have an account, click on the Create link in the top bar.
  3. After clicking on that link, you will have to fill in a form with the basic information needed to create your project:
    1. Name: the full name of your project, e.g. Flickr Person Finder.
    2. Short Name: the slug or short name used in the URL for accessing your project, e.g. flickrperson.
    3. Long Description: A long description where you can use Markdown to format the description of your project. This field is usually used to provide information about the project, the developer, the research group or institutions involved in the project, etc.

PYBOSSA Create link

Note

PYBOSSA usually provides two Categories by default: thinking and sensing. The thinking category represents the standard PYBOSSA project, where users contribute with their skills. The sensing category refers to projects that use volunteer sensing tools like EpiCollect or Raspberry Pi with PYBOSSA for gathering data.


  4. Once you have filled in all the fields, click on the Create the project button, and you will have created your first project.

After creating the project, you should be redirected to the project Settings page, where you will be able to customize your project by adding some extra information or changing some settings. There, you will find a form with the same fields as in the previous step (in case you have changed your mind and want to change any of them) plus the following:

  • Description: A short description of the project, e.g. A project to classify cancer cells. By default, it will have been autogenerated for you from the Long description you filled in during the previous step (but without the Markdown!).
  • Allow Anonymous Contributors: By default, anonymous and authenticated users can participate in all projects. However, you can change this to allow only authenticated volunteers to participate.
  • Password: If you want to control who can access or contribute to your project, you can set a password here and share it with the people you want to allow in. If you leave it blank, the project will not be password protected.
  • Category: Select a category that fits your project. Categories are added and managed by the server Administrators.
  • In addition, you will be able to select and upload an image from your local computer to set it as the project image throughout the server.

PYBOSSA Project Update page

Importing the tasks via the built-in CSV Task Importer

Tasks can be imported from a CSV file or a Google Spreadsheet via the simple built-in Task Creator. You have to do the following:

  1. Navigate to your project’s page (you can directly access it using the slug project name: http://server/project/slug).
  2. Click on the Tasks section in the left-hand local navigation menu:

http://i.imgur.com/nauht7l.png

  3. Then click on the Import Tasks card. After clicking on it, you will see several options. The first ones are the different kinds of importers supported by PYBOSSA: Amazon S3, Twitter, Dropbox, Flickr, YouTube, Google Spreadsheet, CSV URL, and EpiCollect Plus.

http://i.imgur.com/eWBxSyS.png

For example, the Flickr importer lets you import a Flickr album by typing its ID or, if you have an account, by logging in to Flickr and choosing from your own public (and Creative Commons licensed) albums:


http://i.imgur.com/lF9LJVO.jpg

Select one of the albums, click import and all the pictures will be imported as tasks for your PYBOSSA project. As simple as that.

The other importers are very similar. In most cases you’ll provide a URL to the resource, as with the CSV and Google Spreadsheet importers, while the Dropbox, Amazon S3, Twitter, YouTube, and EpiCollect Plus importers have an interface that imports the data automatically for you.

Note

If you’re trying to import from a Google Spreadsheet, ensure the file is accessible to everyone via the Share option, choosing: “Public on the web - Anyone on the Internet can find and view”

Note

Your spreadsheet/CSV file must contain a header row. All the fields in the CSV will be serialized to JSON and stored in the info field. If your field name is one of state, quorum, calibration, priority_0, or n_answers, it will be saved in the respective column. Your spreadsheet must be visible to the public or to everyone with the URL.
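To make the mapping described in this note more concrete, here is a minimal sketch of how rows with a header could be turned into task dictionaries. It is not the importer's actual code: the `rows_to_tasks` function, the sample data and the choice to keep the special fields out of `info` are assumptions made for illustration.

```python
import csv
import io
import json

# Field names that the note says are saved in their own task columns.
SPECIAL_FIELDS = {"state", "quorum", "calibration", "priority_0", "n_answers"}


def rows_to_tasks(csv_text):
    """Turn CSV text with a header row into a list of task dictionaries."""
    tasks = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        task = {"info": {}}
        for field, value in row.items():
            if field in SPECIAL_FIELDS:
                task[field] = value          # goes to the task column
            else:
                task["info"][field] = value  # everything else goes to info
        tasks.append(task)
    return tasks


# Made-up sample: the first line is the mandatory header row.
sample = "question,url,n_answers\nDo you see a human?,http://example.com/1.jpg,5\n"
print(json.dumps(rows_to_tasks(sample), indent=2))
```

Note that the `csv` module returns every value as a string, so a real importer would still need to convert numeric fields such as n_answers.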

In the Task Importer section, you’ll also find templates pre-loaded with Google Spreadsheet URLs. Those templates are examples that you can use to learn how to create your own spreadsheets and import data for image, sound, video, PDF mining and mapping projects.


http://i.imgur.com/eGwKDpB.png

By using these templates, you’ll be able to learn the structure of the tasks, and directly re-use the Task Presenter templates that know the structure (name of the columns) for presenting the task.

Additionally, you can re-use the templates by downloading the CSV files from Google Docs, or even by copying them to your own Google Drive account (click on File -> Make a copy in the Google Docs spreadsheet).

Note

If you import the same URL again, only new records will be added to the project.

Importing the tasks from an EpiCollect Plus Public Project

EpiCollect provides a web application for the generation of forms and freely hosted project websites (using Google’s AppEngine) for many kinds of mobile data collection projects.

Data can be collected using multiple mobile phones, either Android devices or iPhones (using the EpiCollect mobile app), and all data can be synchronised from the phones and viewed centrally (using Google Maps) via the project website or directly on the phones.

EpiCollect can help you to collect data samples according to a form that can include multimedia such as photos. Moreover, EpiCollect can geolocate each data sample, as it supports the built-in GPS found in modern smartphones.

For example, you can create an EpiCollect project where the form asks the user to take a picture of a lake, geolocates it automatically via the smartphone’s built-in GPS and uploads the picture to the EpiCollect server. If the user does not have Internet access at that moment, the data can be synchronized afterwards, e.g. when the user has access to a Wi-Fi hotspot.

PYBOSSA can automatically import data from a public EpiCollect Plus project that you own or that is publicly available on the EpiCollect web site, and help you to validate, analyze, etc. the data that have been obtained via EpiCollect.

If you want to import the data points submitted to a public EpiCollect project, follow these steps:

  1. Navigate to your project’s page (you can directly access it using the slug project name: http://server/project/slug).
  2. Click on the Tasks section in the left-hand local navigation menu.
  3. Click on the Import Tasks button. After clicking on it, you will see several different options.
  4. Click on the Use an EpiCollect Project option:

http://i.imgur.com/A50La7O.png

  5. Then, type the name of the EpiCollect project and the name of the form that you want to import, and click on the Import button.

All the data points should be imported now in your project.

Note

EpiCollect projects gather data almost continuously. For this reason, if you import the same EpiCollect project again, only new data points will be imported. This feature allows you to easily add new data points to the PYBOSSA project without having to do anything special.

Importing the tasks from a Flickr photo set

PYBOSSA also allows you to import tasks for projects based on images (like image pattern recognition ones) directly from a Flickr set (also called an album).

When importing tasks from a Flickr set, a new task will be created for each of the photos in the specified set. The tasks will include the following data about each picture (which will be later available to be used in the task presenter):

  • title: the title of the photograph, as it appears on Flickr.
  • url: the url to the raw .jpg image, in its original size.
  • url_b: the url to the image, ‘big’ sized.
  • url_m: the url to the image, ‘medium’ sized.
  • link: a link to the photo page in Flickr (not to the raw image).
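As an illustration of how these fields might be consumed, the sketch below picks the most suitable image URL from a task's info dictionary. The `pick_image_url` helper, its preference order and the sample values are assumptions for this example only; they are not part of PYBOSSA.

```python
def pick_image_url(info, preferred=("url_m", "url_b", "url")):
    """Return the first available image URL, trying smaller sizes first."""
    for key in preferred:
        if info.get(key):
            return info[key]
    return None


# Made-up info dictionary with the fields listed above.
info = {
    "title": "A lake",
    "url": "http://example.com/original.jpg",
    "url_m": "http://example.com/medium.jpg",
    "link": "http://example.com/photo-page",
}
print(pick_image_url(info))
```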

You can import tasks from a Flickr photo set (a.k.a. album) in either of the following ways:

The easiest one is to give the PYBOSSA server permission to access your Flickr list of albums. To do so, you’ll have to log in to your Flickr account by clicking the “Log in Flickr” button. Then you’ll be redirected to Flickr, where you will be asked if you want to allow PYBOSSA to access your Flickr information. If you say yes, then you’ll be again redirected to PYBOSSA and you’ll see all of your albums. Choose one of them and then click the “Import” button to get all the photos created as tasks for your project.

Note

Next time you try to import photos using the Flickr importer, you’ll see the albums for your account again. If you don’t want PYBOSSA to access them anymore, or just want to use another Flickr account, then click “Revoke access”.

Another option to import from a Flickr album is to specify the ID of the set (album) directly. This option is a bit more advanced (don’t be afraid, it is still very easy if you follow the steps below) and it allows you to import from a photo set that you don’t own (although it will have to be public, and you should also check the rights of the photos in it!). Another advantage is that you don’t need to log in to Flickr, so you don’t even need to have a Flickr account.

These are the steps:

  1. Navigate to your project’s page and click on the Tasks section:
  2. Then click on the Import Tasks button, and select the Flickr importer:
  3. Log in with your Flickr ID and select one of the available albums of your account, or type the ID of the Flickr set you want to import the photos from, then click on the import button:

http://i.imgur.com/UZRBj8y.png

If you cannot find the ID or don’t know what it is, just browse to your Flickr photo set and check the URL. Can you see that last long number right at the end of it? That’s what you’re looking for!


http://i.imgur.com/h6qNDX2.png
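If you prefer to extract the ID programmatically, the following sketch takes the last all-digit segment of the URL path, which matches the "last long number" described above. The `flickr_set_id` helper and the example URL are hypothetical and only meant as an illustration.

```python
from urllib.parse import urlparse


def flickr_set_id(url):
    """Return the last all-digit path segment of a Flickr set URL, or None."""
    segments = [segment for segment in urlparse(url).path.split("/") if segment]
    for segment in reversed(segments):
        if segment.isdigit():
            return segment
    return None


# Made-up URL following the pattern described in the text.
print(flickr_set_id("https://www.flickr.com/photos/someuser/sets/72157600000000000/"))
```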

And all the photos will be imported to your project. Just like with the other importers, each task will be created only once, even if you import twice from the same Flickr set (unless you add new photos to it, of course!).

Note

You will need to make sure that every photo belonging to the set has the visibility set to public, so the PYBOSSA server can then access and present them to the volunteers of your project.

Importing the tasks from a Dropbox account

You can import tasks from arbitrary data hosted on a Dropbox account with the Dropbox importer. When importing tasks this way, the following information will be added to the info field of each task, and will be available later in the task presenter of the project:

  • filename: the name of the file you’re importing as a task.
  • link: the link to the Dropbox page showing the file.
  • link_raw: the link to the raw file served by Dropbox. This is the one you’ll have to use if you want to link directly to the file from the presenter (e.g. for using an image in an <img> tag, you’d do: <img src=task.info.link_raw>).

In addition to this generic information, the Dropbox importer will also recognize some kinds of files by their extension and will attach some extra information to them.

For pdf files (.pdf extension), the following field will be obtained too:

  • pdf_url: direct link to the raw pdf file, with CORS support.

For image files (.png, .jpg, .jpeg and .gif extensions) the following data will be available:

  • url_m: the same as link_raw
  • url_b: the same as link_raw
  • title: the same as filename

For audio files (.mp4, .m4a, .mp3, .ogg, .oga, .webm and .wav extensions):

  • audio_url: raw link to the audio file, which can be used inside an HTML 5 <audio> tag and supports CORS.

For video files (.mp4, .m4v, .ogg, .ogv, .webm and .avi extensions):

  • video_url: raw link to the video file, which can be used inside an HTML 5 <video> tag and supports CORS.

The tasks created with the Dropbox importer are ready to be used with the template project presenters available in PYBOSSA, as they include the described fields.
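The extension-based recognition described above can be sketched as a simple lookup. This is not the importer's real code: the `file_kinds` helper and the case-insensitive matching are assumptions, while the extension lists are the ones given in this section.

```python
import os

# Extension lists taken from the text above.
KINDS = {
    "pdf": {".pdf"},
    "image": {".png", ".jpg", ".jpeg", ".gif"},
    "audio": {".mp4", ".m4a", ".mp3", ".ogg", ".oga", ".webm", ".wav"},
    "video": {".mp4", ".m4v", ".ogg", ".ogv", ".webm", ".avi"},
}


def file_kinds(filename):
    """Return the set of kinds whose extension list contains this file's extension."""
    extension = os.path.splitext(filename)[1].lower()
    return {kind for kind, extensions in KINDS.items() if extension in extensions}


for name in ("scan.pdf", "photo.JPG", "clip.mp4", "notes.txt"):
    print(name, sorted(file_kinds(name)))
```

Because .mp4, .ogg and .webm appear in both the audio and the video lists, such files match both kinds in this sketch.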

Thus, importing your images from Dropbox will allow you to immediately use the image pattern recognition template with them. Likewise, importing videos, audio files or PDFs with the Dropbox importer will let you use the presenter templates for video pattern recognition, sound pattern recognition or document transcription, respectively, with no additional modifications and have them working right away (as long as the files have one of the file extensions mentioned above, of course).

These are the steps:

  1. Navigate to your project’s page and click on the Tasks section:
  2. Then click on the Import Tasks button, and select the Dropbox importer:

3. Click on the “Choose from Dropbox” icon. You will be asked for your Dropbox account credentials. Then select as many files as you want:


http://i.imgur.com/dsgM0Tg.png

4. You can repeat step 3 as many times as you want, and more files will be added to your import. Then, click on “Import”.

Importing the tasks from a Twitter account or search result

Another option for importing tasks is the built-in Twitter importer. It allows you to import tweets as tasks either from a specified Twitter user account or from the results of a query to the Twitter search API.

Tasks imported with it will have the tweet data attached to their info field, and can later be used from within the task presenter. This data is a direct transcription of the data returned by the Twitter API, in particular a Tweet object.

Please note that the values returned by the Twitter API may vary. However, the following fields are guaranteed to always be included in the info field of the tasks:

  • created_at: the date and time the tweet was made.
  • favorite_count: number of times the tweet has been marked as ‘favorite’.
  • retweet_count: number of times the tweet has been retweeted.
  • coordinates: geographic coordinates of the place the tweet was made from. Note that this is not always available for every tweet.
  • tweet_id: the internal ID handled by Twitter to identify this tweet.
  • user: an object with information about the tweet author, as returned by the Twitter API.
  • text: the actual content of the tweet.

In addition, an extra field “user_screen_name” has been added to the info field:

  • user_screen_name: the screen name (or ‘handle’) of the author of the tweet.

For more information, please refer to the Twitter documentation (https://dev.twitter.com/).
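The fields listed above can be pictured with a small sketch that builds an info dictionary from a Tweet-like dictionary. It is only an illustration: the `tweet_to_info` helper and the sample tweet are made up, the mapping of the tweet's `id` to `tweet_id` is an assumption, and the sketch keeps only the guaranteed fields rather than the full tweet data.

```python
# Fields the text above lists as always present in the info field.
GUARANTEED_FIELDS = (
    "created_at", "favorite_count", "retweet_count",
    "coordinates", "user", "text",
)


def tweet_to_info(tweet):
    """Build a task info dictionary from a Tweet-like dictionary."""
    info = {field: tweet.get(field) for field in GUARANTEED_FIELDS}
    # Assumption: the tweet's "id" is what ends up stored as tweet_id.
    info["tweet_id"] = tweet.get("id")
    # Extra convenience field described in the text.
    info["user_screen_name"] = tweet["user"]["screen_name"]
    return info


# Made-up tweet, for illustration only.
example_tweet = {
    "id": 1,
    "created_at": "Mon Jan 01 00:00:00 +0000 2018",
    "favorite_count": 0,
    "retweet_count": 0,
    "coordinates": None,  # not always available
    "user": {"screen_name": "PYBOSSA"},
    "text": "Hello, volunteers!",
}
info = tweet_to_info(example_tweet)
print(info["user_screen_name"], info["tweet_id"])
```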

Note

When importing tweets from a search, retweets will be ignored!

So, to import tasks with the Twitter importer, do as follows:

  1. Navigate to your project’s page and click on the Tasks section:
  2. Then click on the Import Tasks button, and select the Twitter importer:
  3. You can provide your own Twitter credentials and make API requests on your own behalf, or use the credentials provided by us. (The latter only allows you to import the number of tweets returned by a single Twitter API call, which is 100 for searches and 200 for user timelines.)
  4. Fill in the two fields you will find in the form. The first one is for the source of your tweets. If you want them to be imported from a user account, then write it with the “@” symbol, like “@PYBOSSA”. If you just want to import tweets containing a word (or a #hashtag), then type it there. The second field is for you to specify how many tweets you want to import. You can import as many as you want!

Finally, click on the “Import” button, and you are done:


http://i.imgur.com/l5PG2WX.png
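The rules above (an "@" prefix means a user account, anything else is a search, and the shared credentials are limited to a single API call) could be expressed roughly as follows. Both helpers are hypothetical and only mirror the description in this section; they are not the importer's implementation.

```python
# Limits quoted above for the credentials provided by the server.
SINGLE_CALL_LIMIT = {"search": 100, "user": 200}


def parse_source(source):
    """Classify the source field as a user timeline or a search query."""
    source = source.strip()
    if source.startswith("@"):
        return "user", source[1:]
    return "search", source


def importable_count(kind, requested, own_credentials=False):
    """Number of tweets that can be imported for a given request."""
    if own_credentials:
        return requested
    return min(requested, SINGLE_CALL_LIMIT[kind])


kind, value = parse_source("@PYBOSSA")
print(kind, value, importable_count(kind, 500))
```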

Importing the tasks from an Amazon S3 bucket

Tasks can be imported from data hosted on the Amazon S3 service. As with the Dropbox importer, these tasks can use different kinds of data, like images, videos, audio files, PDF files, etc. hosted on any S3 bucket.

The S3 importer works in much the same way as the Dropbox one. When using it, the created tasks will contain the following data in the info field:

  • filename: the name of the file you’re importing as a task.
  • link: the link to the raw file served from Amazon S3.
  • url: same as the above.

In addition to this generic information, the S3 importer will also recognize some kinds of files by their extension and will attach some extra information to them.

For pdf files (.pdf extension), the following field will be obtained too:

  • pdf_url: direct link to the raw pdf file.

For image files (.png, .jpg, .jpeg and .gif extensions) the following data will be available:

  • url_m: the same as link.
  • url_b: the same as link.
  • title: the same as filename.

For audio files (.mp4, .m4a, .mp3, .ogg, .oga, .webm and .wav extensions):

  • audio_url: raw link to the audio file, which can be used inside an HTML 5 <audio> tag.

For video files (.mp4, .m4v, .ogg, .ogv, .webm and .avi extensions):

  • video_url: raw link to the video file, which can be used inside an HTML 5 <video> tag.

The tasks created with the S3 importer are ready to be used with the template project presenters available in PYBOSSA, as they include the described fields.

Thus, importing your images from S3 will allow you to immediately use the image pattern recognition template with them. Likewise, importing videos, audio files or PDFs with the S3 importer will let you use the presenter templates for video pattern recognition, sound pattern recognition or document transcription, respectively, with no additional modifications and have them working right away (as long as the files have one of the file extensions mentioned above, of course).

Importing from an S3 bucket requires that the bucket visibility is set to public so its content can be seen and listed by PYBOSSA. To make a bucket public, go to your AWS management console and select the S3 service. Then select the bucket you want to make public and click on “Properties”. Click on “Add more Permissions” and add a new one with “Grantee: Everyone” and the “List” checkbox selected, like in the following image:


http://i.imgur.com/FuN9XAS.png

You may also need to enable CORS in the bucket. In the same menu as above, click on “Edit CORS Configuration” and configure it. The Amazon S3 documentation on CORS explains the available options.

Finally, you need to make sure that every file inside the bucket that you want to use in a task has a public link too. Go to the bucket content and select the files. Then click on “Actions” and select “Make Public”. Your files will now be visible to everyone, including a PYBOSSA server.


http://i.imgur.com/AHBVQCk.png
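One way to double-check that a file is really public is to request its URL without any credentials. The sketch below assumes the common virtual-hosted URL form `https://BUCKET.s3.amazonaws.com/KEY`; the helper names, the bucket and the key are made up, and buckets in some regions or with unusual names may use a different address.

```python
import urllib.error
import urllib.request
from urllib.parse import quote


def s3_public_url(bucket, key):
    """Build the public URL of an object (assumes virtual-hosted style addressing)."""
    return "https://{}.s3.amazonaws.com/{}".format(bucket, quote(key))


def is_public(url, timeout=10):
    """Return True if an anonymous HEAD request for the URL succeeds."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        # For example 403 Forbidden when the file is not public.
        return False


# Hypothetical bucket and key; call is_public(url) to perform the real check.
url = s3_public_url("my-bucket", "photos/cat 1.jpg")
print(url)
```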

Once your S3 bucket is ready, you can follow these steps to import tasks from it:

  1. Navigate to your project’s page and click on the Tasks section:
  2. Then click on the Import Tasks button, and select the S3 importer:

3. Type the name of the bucket from which you will be importing your tasks and click on “Search in bucket”. If you followed the steps above and your bucket is public, you will see a list of the items it contains. Select as many as you want:


http://i.imgur.com/6RAMqd9.png

  4. When you’re ready, click on “Import”.

Importing the tasks from YouTube

Tasks can be imported from YouTube. Currently the importer supports importing from YouTube with:

  • Playlists

When importing a playlist, the importer parses the information of all its videos and creates tasks with the following info fields:

  • video_url: the URL of the YouTube video, which can be embedded in the task form.
  • oembed: embeddable code for the (old) PYBOSSA video templates.

The tasks created with the YouTube importer are ready to be used with the YouTube and video templates.

Flushing all the tasks

The project settings give you an option to delete all the tasks and associated task runs from your project.

Note

This action cannot be undone, so please be sure that you actually want to delete all the tasks.

Note

This action will only allow you to delete tasks that are not associated with a result. When a result is created, that task and its task runs cannot be deleted so the volunteers can always have access to their contributions.

If you are sure that you want to flush all the tasks and task runs for your project, go to the project page (http://server/project/slug/tasks/) and click on the Settings option in the left-hand local navigation menu:


http://i.imgur.com/nauht7l.png

Then, you will see a subsection called Task Settings and a button labelled Delete the tasks. Click on that button and a new page will be shown:


http://i.imgur.com/DKPV6dc.png

As you can see, a red alert warns you that if you click on the yes button, you will delete not only the project tasks but also the answers (task runs) that you have collected for your project. Before proceeding, be sure that you want to delete all the tasks. After clicking on the yes button, you will see that all the tasks have been flushed.

Creating the Task Presenter

Once you have the project and the tasks in the server, you can start working on the Task Presenter, which is the web application that gets the tasks of your project, presents them to the volunteers and saves the answers they provide.

If you have followed all the steps described in this section, you will already be on the page of your project. If you are not, you only need to go to your project URL. If your project slug or short name is flickrperson, you will be able to access the project management options at this URL:

http://PYBOSSA-SERVER/project/flickrperson

Note

You need to be logged in, otherwise you will not be able to modify the project.

Another way of accessing your project (or projects) is to click on your user name and select the My Projects item from the drop-down menu. From there you will be able to manage your projects:


PYBOSSA User Account

http://i.imgur.com/9sO21Zd.png

Once you have chosen your project, you can add a Task Presenter by clicking on the Tasks local navigation link, and then clicking on the button named Editor under the Task Presenter box.


http://i.imgur.com/nauht7l.png

After clicking on this button, a new web page will be shown where you can choose a template to start coding your project, so you don’t have to start from scratch.


http://i.imgur.com/psC5m6Q.png

After choosing one of the templates, you will be able to adapt it to fit your project needs in a web text editor.


http://i.imgur.com/g9gAvWw.png

Click on the Preview button to get an idea of how your Task Presenter will look.


http://i.imgur.com/DsDDBia.png

We recommend reading the Step by step tutorial on creating a Project, where you will learn how to create the task presenter, which basically consists of an HTML skeleton to load the task data, input fields to get the users’ answers, and some JavaScript to make it work.

Publishing the project

After completing the previous three steps, your project will be almost ready. The final step is to publish it, because until then it is still a draft and is hidden from everyone but you (and admins).

While your project is a draft, you can contribute to it, and the answers (task runs) and results will be stored in the database so you can access them (and test the webhooks solution if you want to do real-time analysis). However, when the project is published, all the task runs and results (as well as the webhooks log entries) will be flushed, so don’t be afraid to try it as much as you like until you are sure that everything works as expected. Once you think the project is ready for the world to see, just click on the Publish button:


http://i.imgur.com/A7m4aa6.png

Note

Publishing a project cannot be undone, so please double check everything before taking the step.

Note

You can allow other users to give you feedback and let them try and see your project before it has been published. In order to do so, just protect it with a password, and people will be able to access it (as long as they have the password, of course).

After publishing it, you will be able to access your project using the slug, or under your account in the Published projects section.

Also, results will begin to be created every time a task is completed. Enjoy!

Using the API

Creating a project using the API also involves four steps:

  1. Create the project,
  2. Create the Task Creator,
  3. Create the Task Presenter for the users, and
  4. Publish it. This needs to be done via the web interface. For more details please refer to Publishing the project.

Creating the project

You can create a project via the API URL /api/project with a POST request (See RESTful API).

You have to provide the following information about the project and convert it to a JSON object (the actual values are taken from the Flickr Person demo project):

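import json

# Project metadata (values from the Flickr Person demo project).
name = u'Flickr Person Finder'
short_name = u'FlickrPerson'
description = u'Do you see a human in this photo?'
# The task presenter is stored as an HTML string inside the info field.
info = {'task_presenter': u'<div> Skeleton for the tasks</div>'}
data = dict(name=name, short_name=short_name, description=description,
            info=info, hidden=0)
# Serialize the project to JSON before sending it in the POST request.
data = json.dumps(data)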
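Sending that JSON object could look roughly like the sketch below, which uses only the Python standard library. The `build_create_request` helper, the server address and the key value are placeholders invented for this example; the `api_key` URL argument is the one described in the notes of this section. The request is only built here, not sent.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_create_request(server, api_key, project):
    """Build (but do not send) the POST request that creates a project."""
    url = "{}/api/project?{}".format(server.rstrip("/"), urlencode({"api_key": api_key}))
    body = json.dumps(project).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Placeholder values: replace them with your own server and API key.
project = {
    "name": "Flickr Person Finder",
    "short_name": "FlickrPerson",
    "description": "Do you see a human in this photo?",
    "info": {"task_presenter": "<div> Skeleton for the tasks</div>"},
}
request = build_create_request("http://domain", "API-KEY", project)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```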

Flickr Person Finder, which is a demo template that you can re-use to create your own project, simplifies this step by using a simple file named project.json:

{
    "name": "Flickr Person Finder",
    "short_name": "flickrperson",
    "description": "Image pattern recognition"
}

The file provides a basic configuration for your project.
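Before using such a file, it can be handy to check that it parses and contains the basic items. The sketch below is one possible way to do that; the `load_project` helper and the list of required keys are assumptions based on the items described at the start of this guide.

```python
import json

# The three basic items every project needs, as described in this guide.
REQUIRED_KEYS = ("name", "short_name", "description")


def load_project(text):
    """Parse the contents of a project.json file and check the basic items."""
    project = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in project]
    if missing:
        raise ValueError("project.json is missing: " + ", ".join(missing))
    return project


example = """
{
    "name": "Flickr Person Finder",
    "short_name": "flickrperson",
    "description": "Image pattern recognition"
}
"""
print(load_project(example)["short_name"])
```

Keep in mind that strict JSON parsers, including Python's `json` module, reject a trailing comma after the last entry.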

Adding tasks

As in all the previous steps, we are going to create a JSON object and POST it using the following API URL /api/task in order to add tasks to a project that you own.

For PYBOSSA all the tasks are JSON objects with a field named info where the owners of the project can add any JSON object that will represent a task for their project. For example, using again the Flickr Person demo project example, we need to create a JSON object that should have the link to the photo that we want to identify:

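import json

# `photo` is a dict describing one Flickr photo and `project_id` is the ID
# of your project; both are assumed to be defined earlier.
info = dict(link=photo['link'],
            url=photo['url_m'],
            question='Do you see a human face in this photo?')
data = dict(project_id=project_id,
            state=0,
            info=info,
            calibration=0,
            priority_0=0)
# Serialize the task to JSON before sending it in the POST request.
data = json.dumps(data)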

Note

‘url_m’ is the name Flickr uses for the URL of the medium-sized version of the photo. The field name can be whatever you want, but as we are using Flickr we use the same naming pattern for storing the data.

The most important field for the task is the info one. This field will be used to store a JSON object with the required data for the task. As Flickr Person is trying to figure out if there is a human or not in a photo, the provided information is:

  1. the Flickr web page posting the photo, and
  2. the direct URL to the image, the <img src> value.

The info field is a free-form field that can be populated with any structure. If your project needs more fields, you can add them and use the format that best fits your needs.

These steps are usually coded in the Task Creator. The Flickr Person Finder project provides a template for the Task Creator that can be re-used without any problems.
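A very small Task Creator could be sketched as a loop that builds one JSON payload per photo. This is not the Flickr Person Finder code: the `build_task` helper, the project ID and the photo list are made up for illustration, and each payload would still have to be sent to /api/task in an authenticated POST request.

```python
import json

QUESTION = "Do you see a human face in this photo?"


def build_task(project_id, photo, question=QUESTION):
    """Build the task dictionary for one photo."""
    info = dict(link=photo["link"], url=photo["url_m"], question=question)
    return dict(project_id=project_id, state=0, info=info,
                calibration=0, priority_0=0)


# Made-up photos; a real Task Creator would obtain them from Flickr.
photos = [
    {"link": "http://example.com/photo/1", "url_m": "http://example.com/1_m.jpg"},
    {"link": "http://example.com/photo/2", "url_m": "http://example.com/2_m.jpg"},
]
# project_id=1 is a placeholder for the ID of your own project.
payloads = [json.dumps(build_task(1, photo)) for photo in photos]
print(len(payloads), "task payloads ready to be posted")
```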

Note

The API request has to be authenticated and authorized. You can get an API-KEY by creating an account in the server and checking the API-KEY created for your user: go to your account profile (click on your user name) and copy the API-KEY field.

This API-KEY should be passed as a URL argument of the POST request, together with the previous data:

[POST] http://domain/api/task/?api_key=API-KEY

One of the benefits of using the API is that you can create tasks by polling other web services that offer an API, like Flickr. Once we have created the tasks, we will need to create the Task Presenter for the project.

Creating the Task Presenter

The Task Presenter is usually a template of HTML and JavaScript that presents the tasks to the users and saves the answers in the database. The Flickr Person demo project provides a simple template which has a <div> to load the input files, in this case the photo, and another <div> to load the action buttons that the users can press to answer the question and save it in the database. Please check the Project Tutorial for more details about the Task Presenter.

As we will be using the API for creating the task presenter, we will basically have to create an HTML file on our computer, read it from a script, and post it to PYBOSSA using the API.

Once the presenter has been posted to the project, you can edit it locally with your own editor, or using the PYBOSSA interface (see previous section).
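Reading the HTML file and wrapping it for the API might look like the sketch below. The `presenter_payload` helper and the file name are hypothetical; the only detail taken from this guide is that the presenter is stored under the task_presenter key of the project's info field. The resulting JSON would then be sent in an authenticated API request.

```python
import json
import os
import tempfile


def presenter_payload(path):
    """Read an HTML file and wrap it as the project's task presenter."""
    with open(path, encoding="utf-8") as handle:
        html = handle.read()
    return json.dumps({"info": {"task_presenter": html}})


# Demonstration with a temporary file standing in for your own template.
directory = tempfile.mkdtemp()
template_path = os.path.join(directory, "template.html")
with open(template_path, "w", encoding="utf-8") as handle:
    handle.write("<div> Skeleton for the tasks</div>")

payload = presenter_payload(template_path)
print(payload)
```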

Note

The API request has to be authenticated and authorized. You can get an API-KEY by creating an account in the server and checking the API-KEY created for your user: go to your account profile (click on your user name) and copy the API-KEY field.

This API-KEY should be passed as a URL argument of the POST request, together with the previous data:

[POST] http://domain/api/project/?api_key=API-KEY

We recommend reading the Step by step tutorial on creating a Project, where you will learn how to create the task presenter, which basically consists of an HTML skeleton to load the task data, input fields to get the users’ answers, and some JavaScript to make it work.

Using PYBOSSA API from the command line

While you can use your own programming language to access the API, we recommend using the PYBOSSA pbs command line tool, as it simplifies the usage of PYBOSSA for any given project.

Creating a project is as simple as creating a project.json file and then running the following command:

pbs --server server --api-key yourkey create_project

Please, read the section pbs for more details.