XuetangX Data Science Theory and Applications Final Exam Answers

    Data Science Theory and Applications - Nanjing University - XuetangX

    1. Single choice (3 points)

    Which of the following areas of knowledge is NOT required of data scientists?

    A. Computer science and information technology

    B. Math and statistics

    C. Business knowledge

    D. Biomedical engineering

    Correct answer: D

    2. Single choice (3 points)

    Which of the following is NOT a basic analytical approach in data science?

    A. Regression

    B. Classification

    C. Descriptive statistics

    D. Clustering

    Correct answer: C

    3. Single choice (3 points)

    What is the main difference between supervised learning and unsupervised learning?

    A. Supervised learning is done using ground truth

    B. Unsupervised learning does not have labeled outputs

    C. Supervised learning aims to learn a function

    D. Unsupervised learning can infer the natural structure present in a set of data points

    Correct answer: A

    4. Single choice (3 points)

    How many columns will a 10*10 long-format table have after one of its columns is converted to wide format?

    A. 17

    B. 18

    C. 19

    D. 20

    Correct answer: B
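
One way to read the keyed answer, sketched in Python rather than the course's R. The sketch assumes the 10 columns are 8 identifier columns plus one key column and one value column, and that the key column holds 10 distinct values; none of this is stated in the question. Under those assumptions, spreading the key column gives 8 + 10 = 18 columns. The helper `pivot_wider` and all column names are hypothetical.

```python
def pivot_wider(rows, key, value):
    """Spread key/value pairs into one column per distinct key value."""
    # Every column other than key and value identifies an output row.
    id_cols = [c for c in rows[0] if c not in (key, value)]
    wide = {}
    for row in rows:
        ident = tuple(row[c] for c in id_cols)
        out = wide.setdefault(ident, dict(zip(id_cols, ident)))
        out[row[key]] = row[value]
    return list(wide.values())


# Assumed layout: 10 rows x 10 columns (8 identifier columns + "key" + "value").
long_rows = [
    {**{f"id{i}": 0 for i in range(8)}, "key": f"k{j}", "value": j}
    for j in range(10)
]

wide_rows = pivot_wider(long_rows, "key", "value")
n_columns = len(wide_rows[0])  # 8 identifier columns + 10 new key columns
```

With fewer distinct key values the column count would be smaller, so 18 depends on the assumption that all 10 keys differ.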

    5. Single choice (3 points)

    Which of the following descriptions of the weighted mean is incorrect?

    A. Calculated by multiplying the weight (or probability)

    B. Associated with a particular event or outcome with its associated quantitative outcome

    C. Very useful when calculating a theoretically expected outcome

    D. Each outcome has a different probability of occurring

    Correct answer: D
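
As a minimal illustration of options A to C, the weighted mean multiplies each outcome by its weight and divides by the total weight. The payoffs and weights below are made up for the example, and two outcomes may share the same weight, which is why option D is not part of the definition.

```python
def weighted_mean(values, weights):
    """Return sum(w * v) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical payoffs with weights 5, 4 and 1 out of 10.
expected_payoff = weighted_mean([0, 10, 100], [5, 4, 1])  # (0 + 40 + 100) / 10

# Equal weights reduce to the ordinary arithmetic mean.
plain_mean = weighted_mean([2, 4, 6], [1, 1, 1])
```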

    6. Single choice (3 points)

    Univariate visualizations do not include ______.

    A. Boxplot

    B. Histogram

    C. Line chart

    D. Density estimate

    Correct answer: C

    7. Single choice (3 points)

    What data type does the code "a=as.vector(c(list('a',1),list('afo',222)))" assign to a?

    A. vector

    B. list

    C. NULL

    D. The code raises an error

    Correct answer: B

    8. Single choice (3 points)

    What output does the code "as.integer(as.factor(c(0,1)))" produce?

    A. [1] 0 1

    B. [1] 1 0

    C. [1] 1 2

    D. [1] 2 1

    Correct answer: C

    9. Single choice (3 points)

    Which of the following statements about Tidy Data is incorrect?

    A. Every column is a variable

    B. Every row is an observation

    C. Every cell is a single numerical value

    D. All the above descriptions of Tidy Data are correct

    Correct answer: C

    10. Single choice (3 points)

    Which function is most suitable for splitting the data by one or two categorical variables?

    A. theme()

    B. geom_bar()

    C. facet_grid()

    D. facet_wrap()

    Correct answer: C

    11. Single choice (3 points)

    Which layer can provide a new perspective of data interpretation for visual analysis?

    A. The facets layer

    B. The theme layer

    C. The coordinate layer

    D. The statistics layer

    Correct answer: C

    12. Single choice (3 points)

    If we want to change individual elements, such as the background color or the font of the title, which function can we use?

    A. geom_bar()

    B. theme()

    C. facet_grid()

    D. facet_wrap()

    Correct answer: B

    13. Single choice (3 points)

    In regression analysis, there are _____ main hypothesis tests.

    A. One

    B. Two

    C. Three

    D. Four

    Correct answer: D

    14. Single choice (3 points)

    If the significance level is 0.05, the corresponding confidence level is ( ).

    A. 93%

    B. 94%

    C. 95%

    D. 96%

    Correct answer: C

    15. Single choice (3 points)

    Which of the following functions presents the result of a regression?

    A. anova()

    B. summary()

    C. confint()

    D. predict()

    Correct answer: B

    16. Single choice (3 points)

    Which of the following algorithms is not a decision tree algorithm?

    A. ID3

    B. YOLOv5

    C. CART

    D. C4.5

    Correct answer: B

    17. Single choice (3 points)

    Which of the following algorithms is not an example of an eager learner?

    A. K-Nearest Neighbors

    B. Logistic regression

    C. Decision tree

    D. Naive Bayes

    Correct answer: A

    18. Single choice (3 points)

    The attribute selection measure used by CART is ______.

    A. Information gain

    B. Information gain ratio

    C. Basic information entropy

    D. Gini index

    Correct answer: D
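
For reference, the Gini index of a set of class labels is 1 minus the sum of squared class proportions, and a candidate split is scored by the size-weighted Gini of its two children. The following is a minimal Python sketch of that measure, not CART itself; the function names and the toy labels are illustrative.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_split(left, right):
    """Size-weighted Gini impurity of a binary split."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


parent = ["yes", "yes", "no", "no"]
parent_gini = gini(parent)                             # two balanced classes
split_gini = gini_split(["yes", "yes"], ["no", "no"])  # both children are pure
```

A tree builder would prefer the split with the lowest weighted impurity; here the split separates the classes perfectly, so its impurity drops from the parent's value to zero.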

    19. Single choice (3 points)

    Which type of clustering uses the concept of a dendrogram?

    A. Prototype-based clustering

    B. Density-based clustering

    C. Hierarchical clustering

    D. Partitioning clustering

    Correct answer: C

    20. Single choice (3 points)

    Which of the following strategies or algorithms belongs to hierarchical clustering?

    A. AGNES

    B. K-means

    C. DBSCAN

    D. SMC

    Correct answer: A
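
To make the link between AGNES (agglomerative nesting) and the dendrogram concrete, here is a small Python sketch of bottom-up clustering on one-dimensional points. Single linkage is an assumption made for brevity, since the question does not name a linkage criterion. The recorded merge list is exactly the information a dendrogram draws.

```python
def agnes_single_linkage(points):
    """Agglomerative clustering of 1-D points; returns the merge history."""
    clusters = [[p] for p in points]  # start with one cluster per point
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the closest pair of clusters (single linkage: nearest members).
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges


history = agnes_single_linkage([1, 2, 6, 7, 15])
merge_distances = [d for _, _, d in history]
```

Cutting the history at a chosen distance gives a particular number of clusters. Each merge is final once made, so an early merge cannot be undone by a later step.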

    21. Single choice (3 points)

    What should you be alert to when using a collaborative filtering strategy?

    A. It determines the features of items that can be used to measure their similarity.

    B. It could be useless at the beginning since the records you have are not enough.

    C. It won't recommend an item that hasn't been bought before.

    D. The "over-specialization" problem still exists.

    Correct answer: B
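
Option B describes what is commonly called the cold-start problem. The Python sketch below is a deliberately small user-based recommender with cosine similarity; the users, items, ratings, and function names are all hypothetical. It shows that a user with no records has no similar neighbors and therefore gets no recommendations.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse rating dicts (item -> rating)."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(target, ratings):
    """Rank items unseen by `target`, weighted by neighbor similarity."""
    scores = {}
    for other, items in ratings.items():
        if other == target:
            continue
        sim = cosine(ratings[target], items)
        if sim <= 0:
            continue  # neighbors with no overlap contribute nothing
        for item, rating in items.items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)


ratings = {
    "ann": {"x": 5, "y": 4},
    "bob": {"x": 5, "z": 4},
    "new_user": {},  # no history yet
}
for_ann = recommend("ann", ratings)
for_new_user = recommend("new_user", ratings)
```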

    22. Multiple choice (4 points)

    What skills do data scientists need to deal with data?

    A. Machine learning algorithms

    B. Knowledge of programming languages

    C. Processing of financial statements

    D. Data visualization knowledge

    Correct answer: A, B, D (no credit for incomplete selections)

    23. Multiple choice (4 points)

    In general, histograms are plotted such that

    A. Empty bins are included in the graph.

    B. Bins are equal in width.

    C. The number of bins is up to the user.

    D. Bars are contiguous. That is, no empty space shows between bars unless there is an empty bin.

    Correct answer: A, B, C, D (no credit for incomplete selections)

    24. Multiple choice (4 points)

    Common problems found in raw data include ______.

    A. Missing data

    B. Noisy data

    C. Unstructured data

    D. Inconsistent data

    Correct answer: A, B, D (no credit for incomplete selections)

    25. Multiple choice (4 points)

    Which of the following belong to the auxiliary layers of the ggplot2 package?

    A. Data

    B. Facets

    C. Statistics

    D. Geometries

    Correct answer: B, C (no credit for incomplete selections)

    26. Multiple choice (4 points)

    Which of the following belong to the classical OLS assumptions for linear regression? ( )

    A. The regression model is linear in the coefficients and the error term.

    B. All independent variables are uncorrelated with the error term.

    C. The error term has a constant variance.

    D. The error term is normally distributed.

    Correct answer: A, B, C, D (no credit for incomplete selections)

    27. Multiple choice (4 points)

    Which of the following are advantages of decision tree algorithms?

    A. Hard to overfit

    B. Different attribute division methods have different preferences for attribute selection

    C. Ability to fit data with irrelevant features and missing values

    D. Easy to understand, explain, and visually analyze

    Correct answer: C, D (no credit for incomplete selections)

    28. Multiple choice (4 points)

    User-based and item-based filtering perform differently in different situations. Which of the choices below are correct?

    A. User-based filtering is more suitable for time-sensitive items like news.

    B. Item-based filtering is more suitable when items are simple and relatively stable.

    C. User-based filtering is more suitable when the number of users is much larger than the number of items.

    D. Item-based filtering is more suitable for tailoring to personal taste.

    Correct answer: A, B, D (no credit for incomplete selections)

    29. True/False (1 point)

    Raw data is the original data provided by the users or collected through some techniques, such as crawlers.

    Correct answer: False

    30. True/False (1 point)

    We can only talk about the correlation between two variables.

    Correct answer: False

    31. True/False (1 point)

    If an analysis requires data preprocessing, it must be done before data analysis.

    Correct answer: True

    32. True/False (1 point)

    When we create a plot skeleton, we first need to think about how to map the data variables to the aesthetics in the graph.

    Correct answer: False

    33. True/False (1 point)

    Hypothesis testing helps you prove if your data is statistically significant and unlikely to have occurred by chance alone.

    Correct answer: True

    34. True/False (1 point)

    To address this concern, nearest-neighbor methods often use weighted voting or similarity-moderated voting such that each neighbor's contribution is scaled by its similarity.

    Correct answer: True
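
A minimal Python sketch of similarity-weighted voting, under one assumption the statement does not fix: similarity is taken to be 1 / (1 + Euclidean distance). The training points and the function name are hypothetical. In the example, a plain majority vote over the three neighbors would pick "b", but the single much closer "a" neighbor outweighs the two distant "b" neighbors.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train, query, k=3):
    """Predict a label by similarity-weighted voting among the k nearest points."""
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = defaultdict(float)
    for point, label in nearest:
        # Assumed similarity: closer neighbors get a larger vote.
        votes[label] += 1.0 / (1.0 + math.dist(point, query))
    return max(votes, key=votes.get)


train = [((0.0, 0.0), "a"), ((3.0, 0.0), "b"), ((3.5, 0.0), "b")]
prediction = weighted_knn_predict(train, (0.5, 0.0), k=3)
```

No model is fitted in advance; all the work happens at query time against the stored training points, which is what makes nearest-neighbor methods lazy rather than eager learners.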

    35. True/False (1 point)

    In hierarchical clustering, you can choose the number of clusters depending on the dendrogram it produces, and can always turn back after making the wrong decision.

    Correct answer: False

    36. True/False (1 point)

    We can use correlation analysis to predict a driver's travel time by using miles traveled and number of deliveries.

    Correct answer: False

    37. True/False (1 point)

    In the narrow sense, a data science product is a product facilitated with a particular data science technique.

    Correct answer: True

