gaokao 2020 Q18

gaokao · China · national-II-science 12 marks Linear regression
Following environmental restoration work, the ecosystem of a desert region has improved significantly, and the number of wild animals has increased. To investigate the population of a certain wild animal species in this region, the region is divided into 200 plots of similar size, and 20 of these plots are selected by simple random sampling as sample areas. The sample data are $(x_i, y_i)$ $(i = 1, 2, \cdots, 20)$, where $x_i$ and $y_i$ denote the plant coverage area (in hectares) and the number of animals of this species in the $i$-th sample area, respectively. The following values have been computed: $\sum_{i=1}^{20} x_i = 60$, $\sum_{i=1}^{20} y_i = 1200$, $\sum_{i=1}^{20} (x_i - \bar{x})^2 = 80$, $\sum_{i=1}^{20} (y_i - \bar{y})^2 = 9000$, $\sum_{i=1}^{20} (x_i - \bar{x})(y_i - \bar{y}) = 800$.

(1) Find the estimated value of the population of this wild animal species in the region (the estimated value equals the average number of this wild animal species in the sample areas multiplied by the number of plots);

(2) Find the correlation coefficient of the sample $(x_i, y_i)$ $(i = 1, 2, \cdots, 20)$ (accurate to 0.01);

(3) Based on current statistical data, there is great variation in plant coverage area among different plots. To improve the representativeness of the sample and obtain a more accurate estimate of the population of this wild animal species in the region, please suggest a more reasonable sampling method and explain your reasoning.
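
The arithmetic behind parts (1) and (2) can be sketched in a few lines of Python. This is only an illustrative check, not an official solution. It assumes the usual sample correlation coefficient $r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$, and the variable and function names are hypothetical, chosen here for readability.

```python
import math

# Summary statistics quoted in the problem statement
n_sample = 20   # number of sampled plots
n_plots = 200   # total number of plots in the region
sum_y = 1200    # sum of y_i over the sample
sxx = 80        # sum of (x_i - x_bar)^2
syy = 9000      # sum of (y_i - y_bar)^2
sxy = 800       # sum of (x_i - x_bar)(y_i - y_bar)


def estimate_population(sum_y, n_sample, n_plots):
    """Part (1): sample mean per plot multiplied by the number of plots."""
    return sum_y / n_sample * n_plots


def correlation(sxy, sxx, syy):
    """Part (2): sample correlation coefficient (assumed standard formula)."""
    return sxy / math.sqrt(sxx * syy)


population = estimate_population(sum_y, n_sample, n_plots)
r = correlation(sxy, sxx, syy)

print(population)    # 12000.0
print(round(r, 2))   # 0.94
```

Note that $\sum x_i = 60$ is not needed for either quantity, since the centered sums are already given.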