Several statistical concepts, even for general audiences - TopicsExpress



          

The following is a compilation of videos that explain several statistical concepts through dance, in a way that general audiences can easily follow, together with their subtitles. Statistics is a useful tool for research in every field, so anyone interested who watches these videos can expect to gain a visual, entertaining, and easy understanding of the following:

1. How a sample is drawn from a population, and how standard error arises in doing so.
2. What a correlation between two variables is, and how it differs from causation.
3. What variance is, and when it turns out to be large or small.
4. What characteristic shapes can be found in a distribution built by organizing initially disordered data by frequency, and in particular what form the most representative one, the normal distribution, takes.

------------------------------------------------------------------------------------------------
1. Sampling & standard error
Dancing statistics: explaining the statistical concept of sampling & standard error through dance

Often we want to know something about a large group (a population) but we can't collect information from the whole group. So we collect it from a smaller representative group (a sample). We use what we observe in the sample to estimate what is going on in the population.

The blue dancer is the population. He will create a shape. Each of the remaining dancers is a sample drawn from the population. Each sample estimates the population shape.

What do you notice about the sample shapes compared to the population shape?

The samples produced very good estimates because their shapes closely matched those of the population. However, each sample made a slightly different estimate of the shape. This is known as sampling variation. Different samples will produce different estimates of the population.

Let's have a look at some different samples. What do you notice about the sample shapes compared to the population?

These samples were not as good at estimating the population because their shapes were quite different to that of the population. Also, there was more variation in the shapes the samples made: these samples had greater sampling variation than before.

Whenever we use samples to estimate something in the population, the estimate can be accurate... or not... A small standard error tells us that samples are likely to produce estimates close to the population value (i.e., accurate). A large standard error means that sample estimates could be quite different to the population value (i.e., some will be inaccurate).

*Summary*
- We use samples to estimate what's going on in a population.
- Different samples will produce different estimates.
- This is called sampling variation.
- The standard error tells us how close or far from the population value sample estimates might fall.

------------------------------------------------------------------------------------------------
2. Correlation
Dancing statistics: explaining the statistical concept of correlation through dance

Sometimes it's useful to look at whether there is a relationship between two variables. In this dance, the red dancers represent one variable and the black dancers another variable.

Look at the first dance, what do you notice about the two variables?

The changes in movements in the two variables were synchronous: as one variable changed, the other changed in the same way. When changes in one variable correspond with similar changes in another variable you have a positive correlation. This is represented by a correlation coefficient, r, that has a positive value up to a maximum of 1.

What do you notice about the timing of the movements in the two variables?

The changes happened at the same time but one didn't cause the other to change. A correlation does not imply causation. It simply measures changes in variables that co-occur.

Is there a relationship between the red and black variables in this dance?

The black dancers' movements didn't correspond at all with those of the red dancers. This shows a correlation of zero. The changes in one variable bear no relation to the changes in the other variable.

Watch this final dance, is there a relationship between the variables?

The black dancers' movements corresponded to those of the red dancers in opposite ways (fast vs. slow, heavy vs. light and reflected movements). When changes in one variable correspond with opposite changes in another you have a negative correlation. This is represented by a correlation coefficient, r, that has a negative value down to a minimum of -1.

*Summary*
- The correlation coefficient quantifies the relationship between two variables.
- It can be positive, negative or zero (i.e., no relationship at all).
- Just because two things are correlated does not mean that one caused the other.
- The size of the correlation coefficient tells us the strength of the relationship between two variables.

------------------------------------------------------------------------------------------------
3. Variance
Dancing statistics: explaining the statistical concept of variance through dance

In the following dances, imagine that you want to assign a score to each dancer to indicate the size of their movements. Compare the size of each dancer's movements to the others.

The dancers' movements (scores) were very similar in size, but they weren't exactly the same. The variation is called variance. In this case variance was small.

Compare the previous dance to the next one, again look at the size of the movements.

Some dancers moved a lot more than others: some made big movements (large scores) and others small movements (small scores). This variation in scores again shows variance, but the variance is larger than in the previous dance.

Let's compare the dances directly, look at the size of the movements within each dance. We can compare variances in different sets of scores.

What do you notice about the variance in the size of movement within each of these dances?

In both dances the variance was small (all dancers made similar sized movements).

What do you notice about the variance in the size of the movements within these two dances?

In both dances the variance was large. When the variances in different sets of scores are roughly the same it's called Homogeneity of variance. Homogeneity sounds scary, but it just means similar.

What do you notice about the variance in the size of movement in these two dances?

In one group the variance was small (similar sized movements), in the other group the variance was large (movements differed in size). When the variance between groups differs it's known as Heterogeneity of variance. Heterogeneity: it's another scary word. Statisticians like scary words. Remember that heterogeneity just means different.

*Summary*
- If scores in the data set have different values they vary.
- This variance can be big or small (or somewhere in between).
- When different sets of scores vary by a similar amount we call it Homogeneity of variance.
- When the variance in different sets of scores is different we call it Heterogeneity of variance.

------------------------------------------------------------------------------------------------
4. Frequency distributions
Dancing statistics: explaining the concept of frequency distributions through dance

Often data you collect look like chaos. Staring at them makes your brain hurt. But there are patterns in the data, if you know how to look for them. Look at this dance, and see how order emerges from the initial chaos.

How have the dancers arranged themselves?

The dancers arranged themselves according to how fast they were dancing. The slowest dancer went to the left, the fastest to the right, and the others ordered themselves between these extremes. Dancers going at the same speed stood behind each other. They created a frequency distribution. The horizontal axis shows the dancers' speed. The vertical axis shows the number of dancers at each speed (the frequency).

Some frequency distributions approximate common shapes. Our dancers approximated a normal or Gaussian distribution. Note that very few dancers lie at the extremes (i.e., dancing really slowly, or dancing like a maniac) and most dancers moved at a speed mid-way between the extremes. This created a characteristic bell-shaped curve.

*Summary*
- Visualising your data can help you to see order in what initially seems like chaos.
- Frequency distributions display each score against how frequently it occurred in the data.
- These distributions can have characteristic shapes.
- The normal distribution is one such shape: it has a bell-shaped curve showing that most scores cluster around the central score, with extreme scores occurring less frequently.

------------------------------------------------------------------------------------------------
1. Sampling & standard error: youtu.be/5fGu8hvdZ6s
2. Correlation: youtu.be/VFjaBh12C6s
3. Variance: youtu.be/pGfwj4GrUlA
4. Frequency distributions: youtu.be/orLSv0g9-lk
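
For readers who would like to try the sampling idea from section 1 on a computer, here is a minimal Python sketch. It is not part of the videos. The population (10,000 made-up scores with mean 100 and standard deviation 15), the sample sizes, and the function names `standard_error` and `sample_means` are all assumptions chosen for illustration. The sketch draws many samples, takes the mean of each, and measures how much those means differ from sample to sample, which is the sampling variation the dancers act out.

```python
import random
import statistics


def standard_error(sample):
    """Estimate the standard error of the mean from a single sample."""
    return statistics.stdev(sample) / len(sample) ** 0.5


def sample_means(population, sample_size, n_samples, seed=0):
    """Draw n_samples random samples and return the mean of each one."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical population of 10,000 scores (mean 100, SD 15 are assumptions).
_rng = random.Random(42)
population = [_rng.gauss(100, 15) for _ in range(10_000)]

# Many small samples versus many larger samples from the same population.
means_small = sample_means(population, sample_size=5, n_samples=200)
means_large = sample_means(population, sample_size=100, n_samples=200)

# The spread of the sample means is the sampling variation.
spread_small = statistics.stdev(means_small)
spread_large = statistics.stdev(means_large)

print(f"spread of means, n=5:   {spread_small:.2f}")
print(f"spread of means, n=100: {spread_large:.2f}")
```

The videos do not discuss sample size, but the usual formula for the standard error of a mean (the sample standard deviation divided by the square root of the sample size) implies that larger samples tend to give estimates that sit closer to the population value, and the two printed spreads should reflect that.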
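
The correlation coefficient r from section 2 can also be computed by hand. The sketch below uses the standard Pearson formula; the function name `pearson_r` and the tiny "red" and "black" score lists are made up for illustration and do not come from the videos.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How the two variables move together around their means.
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves on its own.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined when a variable never changes")
    return co_movement / math.sqrt(spread_x * spread_y)


# Hypothetical scores for the red dancers and three versions of the black dancers.
red = [1, 2, 3, 4, 5]
black_same = [2, 4, 6, 8, 10]      # changes in the same way as red
black_opposite = [10, 8, 6, 4, 2]  # changes in the opposite way
black_unrelated = [2, 5, 1, 3, 3]  # changes bear no linear relation to red

print("same way:    ", round(pearson_r(red, black_same), 3))
print("opposite way:", round(pearson_r(red, black_opposite), 3))
print("unrelated:   ", round(pearson_r(red, black_unrelated), 3))
```

As in the dance, a value of r close to 1 or -1 here only says that the two lists change together; nothing in the calculation says that one list caused the other.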
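
Variance and the homogeneity idea from section 3 can be sketched in the same spirit. The movement scores below are invented, and `variance_ratio` is a hypothetical helper that simply divides the larger variance by the smaller one as a rough way to compare two groups. It is an informal illustration, not a formal statistical test of homogeneity of variance.

```python
import statistics


def variance_ratio(group_a, group_b):
    """Ratio of the larger sample variance to the smaller (always >= 1)."""
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    smaller, larger = sorted([var_a, var_b])
    if smaller == 0:
        raise ValueError("one group has no variation, so the ratio is undefined")
    return larger / smaller


# Hypothetical movement-size scores for two dances.
similar_moves = [5.0, 5.2, 4.9, 5.1, 5.0]  # everyone moves about the same amount
mixed_moves = [1.0, 9.0, 3.0, 8.0, 4.0]    # some big movements, some small

print("variance, similar moves:", round(statistics.variance(similar_moves), 3))
print("variance, mixed moves:  ", round(statistics.variance(mixed_moves), 3))

# A ratio near 1 suggests similar variances (homogeneity);
# a ratio far above 1 suggests different variances (heterogeneity).
print("variance ratio:", round(variance_ratio(similar_moves, mixed_moves), 1))
```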
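
Finally, the frequency distribution from section 4 amounts to counting how often each score occurs and lining the counts up in order. In this sketch the dancer speeds are made up, and `frequency_distribution` and `render_histogram` are hypothetical helper names. Each `#` plays the role of one dancer standing behind the others at the same speed; the bars are drawn sideways, so the speed runs down the page rather than left to right.

```python
from collections import Counter


def frequency_distribution(scores):
    """Count how often each score occurs, ordered from lowest to highest."""
    counts = Counter(scores)
    return {score: counts[score] for score in sorted(counts)}


def render_histogram(distribution):
    """Turn a frequency distribution into text lines, one '#' per occurrence."""
    return [f"{score} | {'#' * count}" for score, count in distribution.items()]


# Hypothetical dancer speeds, recorded in no particular order (the "chaos").
speeds = [3, 1, 4, 2, 3, 5, 2, 3, 4]

distribution = frequency_distribution(speeds)
for line in render_histogram(distribution):
    print(line)
```

With these made-up speeds, most dancers fall at the middle speed and only one at each extreme, a very rough small-scale version of the bell shape described in the video.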
Posted on: Sat, 16 Nov 2013 17:51:15 +0000
