Python Data Science Online Test
About the test
The Python Data Science online test uses a series of live coding questions to assess knowledge of Python and data science libraries such as pandas, NumPy, SciPy, and scikit-learn for analyzing data. The test also requires applying probability and statistics to solve data science problems.
The assessment includes work-sample tasks such as:
- Classification of data using different algorithms.
- Aggregating, grouping, sorting, and cleaning data.
- Building machine learning models.
A good data scientist or data analyst who works in Python should be able to use the functionality of the Python data science libraries to analyze data and extract insights from it.
Sample public questions
A company stores login data and passwords in two different containers:
- DataFrame with columns: Id, Login, Verified.
- Two-dimensional NumPy array where each element is an array that contains: Id and Password.
Elements on the same row/index have the same Id.
Implement the function login_table that accepts these two containers and modifies the id_name_verified DataFrame in place, so that:
- The Verified column is removed.
- The passwords from the NumPy array are added to the DataFrame as the last column, named "Password".
For example, the following code snippet:
    id_name_verified = pd.DataFrame([[1, "JohnDoe", True], [2, "AnnFranklin", False]], columns=["Id", "Login", "Verified"])
    id_password = np.array([[1, 987340123], [2, 187031122]], np.int32)
    login_table(id_name_verified, id_password)
    print(id_name_verified)
should print:
       Id        Login   Password
    0   1      JohnDoe  987340123
    1   2  AnnFranklin  187031122
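One possible approach to the login_table task is sketched below. It is only a minimal illustration, not a reference solution: it assumes, as the task states, that rows of the DataFrame and the array are already aligned by position, so the password column can be assigned positionally without matching on Id.

```python
import numpy as np
import pandas as pd


def login_table(id_name_verified, id_password):
    """Drop "Verified" and append "Password", mutating the DataFrame in place."""
    # inplace=True changes the caller's object instead of returning a copy.
    id_name_verified.drop(columns="Verified", inplace=True)
    # Column 1 of the array holds the passwords; rows are assumed to be
    # aligned with the DataFrame rows, so a positional assignment is enough.
    id_name_verified["Password"] = id_password[:, 1]
```

Because the function mutates its argument and returns nothing, the caller inspects id_name_verified itself after the call.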
As a part of an application for iris enthusiasts, implement the train_and_predict function which should be able to classify three types of irises based on four features.
The train_and_predict function accepts three parameters:
- train_input_features - a two-dimensional NumPy array where each element is an array that contains: sepal length, sepal width, petal length, and petal width.
- train_outputs - a one-dimensional NumPy array where each element is a number representing the species of iris which is described in the same row of train_input_features. 0 represents Iris setosa, 1 represents Iris versicolor, and 2 represents Iris virginica.
- prediction_features - a two-dimensional NumPy array where each element is an array that contains: sepal length, sepal width, petal length, and petal width.
The function should train a classifier using train_input_features as input data and train_outputs as the expected result. After that, the function should use the trained classifier to predict labels for prediction_features and return them as an iterable (like list or numpy.ndarray). The nth position in the result should be the classification of the nth row of the prediction_features parameter.
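The train-then-predict flow described above could be sketched with scikit-learn as follows. The choice of a k-nearest neighbors classifier and of n_neighbors=3 is an assumption made for illustration; the task does not prescribe a particular algorithm, and any classifier with fit and predict methods would fit the same shape.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier


def train_and_predict(train_input_features, train_outputs, prediction_features):
    """Fit a classifier on the training data and label the prediction rows."""
    # Assumption: k-NN with 3 neighbors; the task leaves the algorithm open.
    classifier = KNeighborsClassifier(n_neighbors=3)
    classifier.fit(train_input_features, train_outputs)
    # predict returns a NumPy array whose nth entry labels the nth input row.
    return classifier.predict(prediction_features)
```

In practice the choice of model and its hyperparameters would be checked against held-out data, for example with a train/test split of the iris dataset.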
Implement the desired_marketing_expenditure function, which returns the required amount of money that needs to be invested in a new marketing campaign to sell the desired number of units.
Use the data from previous marketing campaigns to estimate the linear relationship between the amount of money invested and the number of units sold.
For example, for the desired number of 60,000 units sold and previous campaign data from the table below, the function should return the float 250000.0.
| Campaign | Marketing expenditure | Units sold |
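A minimal sketch of the linear approach follows. The parameter names (marketing_expenditure, units_sold, desired_units_sold) are assumptions, since the function signature is not shown here, and the sample numbers in the usage lines are made up for illustration rather than taken from the campaign table. The idea is to fit units sold as a linear function of expenditure by least squares, then invert the fitted line.

```python
import numpy as np


def desired_marketing_expenditure(marketing_expenditure, units_sold, desired_units_sold):
    """Estimate the spend needed to sell desired_units_sold units.

    Parameter names are hypothetical; the original signature is not shown.
    """
    # Least-squares fit of: units_sold = slope * expenditure + intercept
    slope, intercept = np.polyfit(marketing_expenditure, units_sold, deg=1)
    # Invert the fitted line to solve for the expenditure.
    return float((desired_units_sold - intercept) / slope)


# Made-up, perfectly linear data (units = 0.2 * expenditure + 10000).
expenditure = [100_000, 200_000, 300_000]
units = [30_000, 50_000, 70_000]
# Approximately 250000.0 for this illustrative data.
print(desired_marketing_expenditure(expenditure, units, 60_000))
```

The inversion assumes a non-zero slope; data in which units sold do not vary with expenditure would need separate handling.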
You are given a list of tickers and their daily closing prices for a given period.
Implement the most_corr function that, when given each ticker's daily closing prices, returns the pair of tickers that are the most highly (linearly) correlated by daily percentage change.
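The correlation search could be sketched as below. Several details are assumptions, because the input format is not specified here: prices is taken to be a pandas DataFrame with one column per ticker and one row per day, "most highly correlated" is read as the largest Pearson coefficient (not the largest absolute value), and the result is returned as a pair of column labels.

```python
import numpy as np
import pandas as pd


def most_corr(prices):
    """Return the two tickers whose daily percentage changes correlate most."""
    # Daily percentage change; the first row has no previous day, so drop it.
    returns = prices.pct_change().dropna()
    # Pairwise Pearson correlation between the tickers' daily changes.
    corr = returns.corr().to_numpy().copy()
    # Every ticker correlates perfectly with itself, so exclude the diagonal.
    np.fill_diagonal(corr, -np.inf)
    # nanargmax ignores undefined correlations (e.g. a constant series).
    i, j = np.unravel_index(np.nanargmax(corr), corr.shape)
    return prices.columns[i], prices.columns[j]
```

If strongly negative correlations should also count, the same search can be run on the absolute values of the matrix instead.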
For companies: premium questions
Buy TestDome to access premium questions that can't be practiced.
Get money back if you find any premium question answered online.
8 more premium Python Data Science questions
Skills and topics tested
- Python for Data Science
- Cauchy Distribution
- Exponential Distribution
- Normal Distribution
- Data Cleaning
- Machine Learning
- Nonlinear Regression
- Processing CSV
- Data Aggregation
- K-Nearest Neighbors
For job roles
- Data Analyst
- Data Scientist
What others say
Simple, straight-forward technical testing
TestDome is simple, provides a reasonable (though not extensive) battery of tests to choose from, and doesn't take the candidate an inordinate amount of time. It also simulates working pressure with the time limits.
Jan Opperman, Grindrod Bank
Solve all your skill testing needs
150+ Pre-made tests
From web development and database administration to project management and customer support.
Mix questions for different skills or even custom questions in one test.
How TestDome works
- Choose a pre-made test or create a custom test.
- Invite candidates via email, URL, or your ATS.
- Candidates take a test remotely.
- Sort candidates and get individual reports.