Can someone explain more about cases of regression that are not based on a time series? It seems to me that all regression is time series based.
My research on regression suggests that it is a way to determine a relationship between two variables. It is called regression because there is a regressor which aims to predict a regressand. In other fields, the regressor is also known as a feature, independent variable or explanatory variable. So, just as in other fields of maths, in statistics you aim to determine a relationship that will hopefully predict some phenomenon, i.e. you’re looking for a formula that lets you say, ‘I expect the result to be y when x is some given number’. My understanding is that other fields have a much ‘cleaner’ relationship than statistics does, yielding an exact function that does not require any additional statistical massaging. In statistics, because randomness enters the system from various sources, you get a scatter plot, which then has to be fitted with a simpler function.
To address your question about whether regression analysis only works with time series data, I think the answer is yes. You effectively have two choices (correct me if I’m wrong): you can look at data either at a point in time or across time. In other words, you can try to find a relationship between observations of two random variables taken over a period of time, or between observations taken at one point in time. But I’m not 100% sure about the latter. I can’t think of an example where you would look for a relationship between variables at one point in time. If someone could provide an example, that would be great.
Examples of non-time series:
Predicting percent correct on q-bank (Y) as a function of some Xs (time studying, prep provider, prior finance degree…)
Modeling the probability that a company’s earnings exceed some amount (Y) as a function of some Xs (new CEO in the last 2 years as a yes/no, employee turnover, trading volume in the prior year…)
Predicting the probability you will have a stroke or heart attack (Y) on the basis of Xs (your age, gender, total cholesterol, good cholesterol…)
Predicting the exam score for a student (Y) based on Xs (number of hours spent studying, GPA before starting the course, SAT score,…)
Predicting the selling price of a house (Y) based on Xs (initial listing price, square feet, attached garage y/n, lot size, median home value in neighborhood, days on market…)
There really are infinite examples of non-time series regression. Hopefully this is helpful.
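To make one of these concrete, here is a minimal sketch of the house-price example as a cross-sectional regression. Everything in it is an assumption made up for illustration: the data are synthetic, only two of the Xs are used (square feet and attached garage), and the "true" coefficients and the helper name `fit_ols` are hypothetical. The point it tries to show is that the fit does not depend on the order of the rows, which is what you would expect when there is no time dimension.

```python
import numpy as np

# Synthetic cross-section: 200 houses observed once each (invented numbers).
rng = np.random.default_rng(0)
n = 200
sqft = rng.uniform(800, 3500, size=n)              # square feet
garage = rng.integers(0, 2, size=n).astype(float)  # attached garage, 1 = yes
noise = rng.normal(0, 10_000, size=n)              # randomness in the system

# Assumed "true" relationship, chosen only for this illustration.
price = 50_000 + 150 * sqft + 20_000 * garage + noise


def fit_ols(x_columns, y):
    """Ordinary least squares with an intercept, via numpy's lstsq."""
    X = np.column_stack([np.ones(len(y)), *x_columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # [intercept, slope for each column in x_columns]


coef = fit_ols([sqft, garage], price)

# Shuffle the houses: the rows have no temporal order, so the fit is unchanged.
perm = rng.permutation(n)
coef_shuffled = fit_ols([sqft[perm], garage[perm]], price[perm])

print("intercept, sqft, garage:", coef)
print("same after shuffling rows:", np.allclose(coef, coef_shuffled, rtol=1e-6))
```

Shuffling the rows of a genuine time series would destroy its ordering, so any model that uses that ordering (lags, trends) would no longer be fitting the same thing.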
This is not accurate, as there are plenty of instances where some kind of regression is used on cross-sectional (non-time-series) data.
I think I made a mistake with my definitions. I was trying to say that cross-sectional data, being taken at a point in time, are time series data simply because they are tied to that point in time, but I think your definition is better.
Maybe I’m not understanding you, but cross-sectional data are not time series data. A time series is a temporally ordered set of measurements of some variable X_it, such that the measurement of X_i at time t inherently precedes the measurement of X_i at time t+1. Cross-sectional data are inherently taken at a single, defined time, so there is no way to order the measurements of X_i temporally.
Think of a time series of annual earnings for company A over 15 years: these are inherently ordered.
Compare this to a cross sectional set of earnings from 15 companies in the year 2000: there is no temporal ordering that exists in these data.
Now you might say, “What if I take cross-sections of the same 15 companies at 5-year intervals?” In other words, if we sample 15 companies’ annual earnings in 2000, 2005… and it’s the same 15 companies, what is this called? This is often referred to in econometrics as “panel data”, but more generally in statistics as a repeated-measures design or longitudinal data. The CFA Institute doesn’t teach you how to even think about this scenario (and arguably, not really the others, but they claim they do). As you might notice here, a key assumption of independent observations of the DV (annual earnings of each company over time) is not tenable, because company A’s earnings in one year are likely associated with its other earnings in the series, and possibly in a different way from how company B’s earnings relate to its own.
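A rough sketch of the three data layouts described above, using a small array of made-up earnings. All of the numbers are invented, and the assumption that each company has its own persistent earnings level is only there to illustrate why observations from the same company are not independent.

```python
import numpy as np

# Synthetic panel: 15 companies, each measured in the same 4 years (invented data).
rng = np.random.default_rng(1)
years = np.array([2000, 2005, 2010, 2015])
n_companies = 15

# Assumption for illustration: each company has its own persistent earnings
# level, plus small year-to-year noise.
company_level = rng.normal(100, 10, size=(n_companies, 1))
noise = rng.normal(0, 1, size=(n_companies, len(years)))
panel = company_level + noise  # rows = companies, columns = years

# Time series: one company followed across years (ordered by time).
one_company_series = panel[0, :]

# Cross-section: all companies in a single year (no temporal ordering).
cross_section_2000 = panel[:, 0]

# Same companies in 2000 and 2005: strongly correlated, so treating all
# 60 observations as independent would not be tenable.
corr_2000_2005 = np.corrcoef(panel[:, 0], panel[:, 1])[0, 1]

print("time series for company 0:", one_company_series)
print("cross-section in 2000 has", cross_section_2000.size, "companies")
print("correlation of 2000 vs 2005 earnings:", corr_2000_2005)
```

In this layout a row is a time series, a column is a cross-section, and the whole array is the panel.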
Yes, you understood my misunderstanding perfectly. I was calling cross-sectional data time series data even though they are not. I just said that since it happens at a single point in time, it’s a type of time series data.
Is the bold text part of your misunderstanding? Cross-sectional data are not a type of time series, because you’ve removed any chance of a temporal ordering. I am unsure whether you’re now stating that as a correct remark or whether that was your misunderstanding.