Multiple Time Series Analysis - Science topic
Questions related to Multiple Time Series Analysis
I'm using multiple time series measured daily from 2015 to 2021, but the records for some days are missing across all of the series. How can I impute the values for those missing days?
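A minimal sketch of one common option, linear interpolation across the gap between the nearest observed days (the function name and the toy data are illustrative, not from the original question):

```python
from datetime import date, timedelta

def interpolate_missing_days(records):
    """Fill missing daily values by linear interpolation between the
    nearest observed days. `records` maps date -> value, with gaps
    for missing days. Returns a complete date -> value dict."""
    days = sorted(records)
    filled = {}
    for prev, nxt in zip(days, days[1:]):
        filled[prev] = records[prev]
        gap = (nxt - prev).days
        for k in range(1, gap):
            frac = k / gap
            filled[prev + timedelta(days=k)] = (
                (1 - frac) * records[prev] + frac * records[nxt])
    filled[days[-1]] = records[days[-1]]
    return filled

# Example: 2021-01-02 and 2021-01-03 are missing.
obs = {date(2021, 1, 1): 10.0, date(2021, 1, 4): 16.0}
full = interpolate_missing_days(obs)
```

Interpolation is only sensible for smooth series; for variables like rainfall, methods that respect zeros and skewness (or multiple imputation) may be more appropriate.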
I am using the Mann-Kendall test and Sen's slope to assess trends in monthly rainfall datasets spanning 64 years, e.g., Jan 1957, Jan 1958, ..., Jan 2020. Since the region is a semi-arid one, there are a lot of zero values (NOT missing values) in the time series. For example, the time series for rainfall in January has only 15 non-zero values out of 64 data points. My question is: how will this affect the trend test (Mann-Kendall) and the trend slope (Theil-Sen)?
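As a hedged illustration of the mechanics: tied zeros contribute 0 to the Mann-Kendall S statistic and enter the variance through the standard tie correction, so heavy zero-inflation mainly reduces the information available to the test. A small self-contained sketch (toy data, not the rainfall series from the question):

```python
from itertools import combinations
from collections import Counter

def mann_kendall(x):
    """Mann-Kendall S statistic and tie-corrected variance.
    Zeros are ordinary tied values: each pair of zeros contributes
    0 to S, and the tie group shrinks Var(S)."""
    def sign(d):
        return (d > 0) - (d < 0)
    s = sum(sign(xj - xi) for xi, xj in combinations(x, 2))
    n = len(x)
    ties = Counter(x).values()
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in ties)) / 18
    return s, var_s

# Toy series: many zero months plus a few wet ones.
series = [0, 0, 0, 0, 0, 1.2, 0, 0, 3.5, 0]
s, var_s = mann_kendall(series)
```

Without the tie correction, Var(S) for n = 10 would be 2250/18; the eight tied zeros here remove 1176/18 of it, which is how zero-heavy series lose power.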
TLDR: How many variables can I have in a VAR or VECM model?
I am writing my thesis and I am using a VECM (VAR model with error correction for cointegration) model for analyzing the relationship between the prices of an energy exchange and some other factors. So far I have 4 variables in my model and I am thinking of adding more.
My question is: after how many variables does the model become unusable and unstable, or can I add as many as I like?
Thank you for your answers in advance!
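One way to make the limit concrete: a VAR(p) with k variables estimates k·p + 1 coefficients per equation, so the parameter count grows quadratically in k and quickly eats into the sample size. A tiny arithmetic sketch (no econometrics library, just the counting rule):

```python
def var_param_count(k, p, include_const=True):
    """Coefficients per equation and in total for a VAR(p) with k
    variables: each equation regresses one variable on p lags of
    all k variables (plus a constant)."""
    per_eq = k * p + (1 if include_const else 0)
    return per_eq, k * per_eq

# With 4 variables and 4 lags, each equation already
# estimates 17 coefficients:
per_eq, total = var_param_count(k=4, p=4)
```

With, say, 150 monthly observations, 17 coefficients per equation is workable, but doubling k to 8 raises it to 33 per equation, which is why adding variables degrades precision long before the model literally fails to estimate.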
Dear Colleagues,
I estimated the OLS models and ran several diagnostic tests on them; however, the instability in the CUSUMSQ test persists, as shown in the attached figure. What should I do in this case?
Best
Ibrahim
![](profile/Ibrahim-Niftiyev/post/CUSUMSQ_issue_in_OLS_estimations/attachment/6040e76c4bdefa000153f3b5/AS%3A997635657838598%401614866284940/image/cusumsq.png)
Hi,
I am having trouble with a problem in the field of optimal control and the generation of optimal time series.
Let's consider a system whose dynamics are represented by dx/dt = f(t, x(t), u(t), p(t)), x and u being respectively the state and control vectors of the system, and p a vector of parameters which directly influence the system's dynamics.
An example illustrating this would be considering a drone, going from point A to point B, in minimum time, but subject to a windy environment (the wind being represented by the time-dependent variable p(t)).
I have generated, by solving an Optimal Control Problem, optimal time-series for x(t) and u(t), for several values of p=p(t)=constant.
I would now like to interpolate, for any given value of p(t) at time t, the "nearly-optimal" control u(t) to be applied to the system between time t and time t+1, based on the OCP results previously computed.
Would you know if this is even possible? I have not really been able to find published work on this topic; if you have any suggestions, I would be grateful.
Thanks,
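One pragmatic option, assuming the OCP solutions are stored for a grid of constant p values, is simple gain-scheduling: at run time, interpolate the stored control trajectories between the two nearest p values. A minimal sketch (names and data are hypothetical, and there is no optimality guarantee for the interpolated control):

```python
import bisect

def interp_control(p_grid, u_tables, p_now, t_idx):
    """Interpolate a near-optimal control for parameter value p_now
    at time index t_idx, from controls precomputed at the sorted
    grid values in p_grid. u_tables[i][t] is the optimal control
    at time t for p = p_grid[i]."""
    if p_now <= p_grid[0]:
        return u_tables[0][t_idx]
    if p_now >= p_grid[-1]:
        return u_tables[-1][t_idx]
    j = bisect.bisect_right(p_grid, p_now)
    lo, hi = p_grid[j - 1], p_grid[j]
    w = (p_now - lo) / (hi - lo)
    return (1 - w) * u_tables[j - 1][t_idx] + w * u_tables[j][t_idx]

# Offline solutions at wind speeds 0 and 2; query wind 1.0 at t = 0.
p_grid = [0.0, 2.0]
u_tables = [[1.0, 0.5], [3.0, 2.5]]
u = interp_control(p_grid, u_tables, 1.0, 0)
```

Whether linear interpolation over p is adequate depends on how smoothly the optimal control varies with p; near bifurcations of the optimal solution it can fail badly, which is worth checking by re-solving the OCP at intermediate p values.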
Dear Colleagues,
If I have 10 variables in my dataset (time series), of which 9 are explanatory and 1 is dependent, and if I establish that all the variables are non-stationary, should I take the first difference of the dependent variable as well?
Best
Ibrahim
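For reference, first differencing is a purely mechanical step applied to every I(1) series in the regression, the dependent variable included (unless a cointegrating relationship justifies an error-correction setup instead). A trivial sketch:

```python
def first_difference(x):
    """First difference of a series: d_t = x_t - x_(t-1).
    One observation is lost at the start."""
    return [b - a for a, b in zip(x, x[1:])]

y = [100, 103, 101, 106]
dy = first_difference(y)
```

If all ten variables are differenced, remember that the regression then describes short-run changes, not the long-run levels relationship.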
Dear colleagues,
I am capable of estimating linear relationships between X and Y variables via OLS or 2SLS (in EViews, for example); however, I also need to study how to estimate/model non-linear relationships. If you know of any source that explains this in simple language based on time series, your recommendations are most welcome. Thank you in advance.
Best
Ibrahim
Hi,
When the DCC-GARCH model is estimated in Stata, pairwise quasi-correlations are given at the end of the output. What do they mean in practice? Are they the mean values of the dynamic correlations, or something else?
I would much appreciate it if anybody could clarify this.
Kind regards
Thushara
Hi
I've estimated a DCC-GARCH(1,1) model using Stata. At the end of the Stata output, a correlation matrix is given, which is also called the quasi-correlation matrix. Is it the conditional correlation matrix or a different one? If so, is it the average/mean value of the dynamic conditional correlations?
I would much appreciate it if anybody could clarify this.
(I've herewith attached the output)
Kind regards
Thushara
What is the best open-source (i.e., free) approach/library/tool for unsupervised/semi-supervised (i.e., with limited to no training data) time-series anomaly detection, for data like this: https://github.com/numenta/nupic/blob/master/src/nupic/datafiles/extra/nycTaxi/nycTaxi.csv ?
Hi,
In a DCC-GARCH(1,1) model (the dependent variable is the first difference of the logarithm of the series) based on monthly data:
1. How do you interpret the unconditional and conditional correlations in a DCC-GARCH model?
2. Is it possible to get a single correlation matrix for the conditional correlations (like the unconditional correlation matrix, without a correlation for each month and pair)? Or do we just need to present the data using a conditional variance graph for each pair?
Your comments/advice on this would be much appreciated.
Kind regards
Thushara
My clinical study measured blood biomarkers (glucose, insulin, glucagon, GLP-1, GIP, amino acids, etc.) at baseline before an intervention meal and at multiple time points after the meal.
We took these measurements three times, using three different intervention meals on three different days. My main objective is to compare whether the change in blood biomarkers differs between the intervention meals.
There are several AUC calculation methods for this, such as the total AUC, the incremental AUC (which ignores the area under the baseline) and the net incremental AUC (which subtracts the area under the baseline). How do I determine which one to use, and what is the rationale?
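As a hedged illustration of how the three quantities relate, here is a trapezoidal sketch. Note the incremental AUC here simply clips values at baseline rather than computing exact baseline-crossing points, so it slightly overstates the standard (Wolever-style) iAUC on segments that cross the baseline:

```python
def auc_variants(times, values):
    """Trapezoidal AUCs for a postprandial curve.
    total: area under the curve down to zero.
    iauc:  incremental AUC, counting only area above baseline
           (values clipped at baseline; approximate on crossings).
    net:   total AUC minus the baseline rectangle, so area below
           baseline is subtracted and net can be negative."""
    base = values[0]
    total = iauc = 0.0
    for (t0, v0), (t1, v1) in zip(zip(times, values),
                                  zip(times[1:], values[1:])):
        dt = t1 - t0
        total += dt * (v0 + v1) / 2
        inc0, inc1 = max(v0 - base, 0.0), max(v1 - base, 0.0)
        iauc += dt * (inc0 + inc1) / 2
    net = total - base * (times[-1] - times[0])
    return total, iauc, net

# Glucose-like toy curve (mmol/L) at 0, 30, 60, 120 min.
t = [0, 30, 60, 120]
v = [5.0, 8.0, 6.0, 4.0]
total, iauc, net = auc_variants(t, v)
```

The numerical contrast makes the choice concrete: total AUC is dominated by the baseline level, iAUC isolates the rise above baseline, and net iAUC penalizes dips below baseline.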
I am modelling the volatility of international tourist arrivals from several source markets. I mainly use two approaches: ARIMA-GARCH or ARIMA-GJR models, and SARIMA-GARCH or SARIMA-GJR models. The initial estimates suggest that the error terms of some models do not follow a normal distribution, even though the estimation assumed normality. In that case I obtained the Bollerslev-Wooldridge standard errors, as they are said to be better than ordinary standard errors. Since some of the models do not have normally distributed error terms, I re-estimated all the models assuming a Student-t distribution, as is recommended when the error term is non-normal. However, Bollerslev-Wooldridge standard errors are not available with the Student-t distribution (in EViews 10); instead, Huber-White standard errors are available. I am wondering whether these are better than Bollerslev-Wooldridge standard errors or produce approximately similar outcomes. Any advice is much appreciated!
Hi,
can anyone recommend some literature and/or software for multi-level non-hierarchical dynamic factor models?
I am trying to compare two time series and am assessing different methodologies for examining their relationship.
If you have used the Granger causality test, would you be willing to share some literature on the topic, please?
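For intuition, the Granger test compares a restricted autoregression of y on its own lags with an unrestricted one that adds lags of x, via an F statistic on the change in the residual sum of squares. A one-lag sketch in pure Python (toy data with fixed pseudo-noise; real applications use more lags, stationarity checks, and a proper p-value):

```python
def ols_ssr(X, y):
    """Sum of squared residuals from OLS of y on the columns of X,
    via the normal equations and Gaussian elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    for c in range(k):                       # forward elimination
        piv = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        b[c], b[piv] = b[piv], b[c]
        for r in range(c + 1, k):
            f = A[r][c] / A[c][c]
            for j in range(c, k):
                A[r][j] -= f * A[c][j]
            b[r] -= f * b[c]
    beta = [0.0] * k
    for c in reversed(range(k)):             # back substitution
        beta[c] = (b[c] - sum(A[c][j] * beta[j]
                              for j in range(c + 1, k))) / A[c][c]
    return sum((yi - sum(bi * xi for bi, xi in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

def granger_f(x, y):
    """F statistic for 'x Granger-causes y' with one lag."""
    Xr = [[1.0, y[t - 1]] for t in range(1, len(y))]            # restricted
    Xu = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]  # + lag of x
    yy = y[1:]
    ssr_r, ssr_u = ols_ssr(Xr, yy), ols_ssr(Xu, yy)
    n = len(yy)
    return (ssr_r - ssr_u) / (ssr_u / (n - 3))

# x leads y by one step (plus small fixed noise), so F is large.
x = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1]
noise = [0.02, -0.03, 0.04, -0.01, 0.03, -0.04, 0.01,
         -0.02, 0.03, -0.03, 0.02, -0.01]
y = [noise[0]] + [0.9 * xi + e for xi, e in zip(x[:-1], noise[1:])]
f_stat = granger_f(x, y)
```

Granger causality is about predictive content, not structural causation, which is worth stating when reporting it alongside other comparison methods.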
Hi!
We are trying to estimate body mass (W) heritability and cross-sex genetic correlation using MCMCglmm. Our data matrix consists of three columns: ID, sex, and W. Body mass data is NOT normally distributed.
Following previous advice, we first separated weight data into two columns, WF and WM. WF listed weight data for female specimens and “NA” for males, and vice-versa in the WM column. We used the following prior and model combination:
prior1 <- list(R=list(V=diag(2)/2, nu=2), G=list(G1=list(V=diag(2)/2, nu=2)))
modelmulti <- MCMCglmm(cbind(WF,WM)~trait-1, random=~us(trait):animal, rcov=~us(trait):units, prior=prior1, pedigree=Ped, data=Data1, nitt=100000, burnin=10000, thin=10)
The resulting posterior means were suspiciously low (e.g., 0.00002). We calculated heritability values anyway, using the following:
herit1 <- modelmulti$VCV[,'traitWF:traitWF.animal']/
  (modelmulti$VCV[,'traitWF:traitWF.animal']+modelmulti$VCV[,'traitWF:traitWF.units'])
herit2 <- modelmulti$VCV[,'traitWM:traitWM.animal']/
  (modelmulti$VCV[,'traitWM:traitWM.animal']+modelmulti$VCV[,'traitWM:traitWM.units'])
corr.gen <- modelmulti$VCV[,'traitWF:traitWM.animal']/
  sqrt(modelmulti$VCV[,'traitWF:traitWF.animal']*modelmulti$VCV[,'traitWM:traitWM.animal'])
We get heritability estimates of about 50%, which is reasonable, but correlation estimates were extremely low, about 0.04%.
Suspecting the model was wrong, we used the original dataset with all weight data in a single column and tried the following model:
prior2 <- list(R=list(V=1, nu=0.02), G=list(G1=list(V=1, nu=1, alpha.mu=0, alpha.V=1000)))
model <- MCMCglmm(W~sex, random=~us(sex):animal, rcov=~us(sex):units, prior=prior2, pedigree=Ped, data=Data1, nitt=100000, burnin=10000, thin=10)
The model runs, but it refuses to calculate the "herit" values, giving the error message "subscript out of bounds". We would also add that in this case the posterior density graph for sex2:sex.animal is not bell-shaped.
What are we doing wrong? Are we even using the correct models?
Eva and Simona
Hey, dears!
I am looking for a mathematical model with chaotic bursting outside of neural dynamics, without success so far.
In particular, I am interested in whether (using couplings) it is possible to force the Lorenz or Rössler systems, for example, to exhibit this behavior.
I would welcome suggestions of articles or approaches for this.
Thank you.
Regards,
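As a hedged illustration (my own toy construction, not from a published model): slowly sweeping the Lorenz ρ parameter back and forth across the chaos threshold (around ρ ≈ 24.7) makes the trajectory alternate between quiescent spiraling and chaotic phases, a crude bursting-like behavior:

```python
import math

def lorenz_forced(n=80000, dt=0.005, sigma=10.0, beta=8.0 / 3.0):
    """Euler integration of the Lorenz system with a slowly
    modulated rho(t) that crosses the chaos threshold. The period
    of the modulation (100 time units) and the sweep amplitude are
    illustrative choices, not tuned values from the literature."""
    x, y, z = 1.0, 1.0, 1.0
    out = []
    for i in range(n):
        rho = 24.7 + 6.0 * math.sin(2.0 * math.pi * i * dt / 100.0)
        dx = sigma * (y - x)
        dy = x * (rho - z) - y
        dz = x * y - beta * z
        x, y, z = x + dt * dx, y + dt * dy, z + dt * dz
        out.append(x)
    return out

traj = lorenz_forced()
```

This is a slow-fast (non-autonomous) forcing rather than a coupling between two chaotic systems, but the same idea works with a slow subsystem driving ρ, which is closer to how bursting arises in neural models.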
I want to look at the temporal variability of community composition.
I've been reading about methods such as redundancy analysis with principal coordinates of neighbourhood matrices (RDA-PCNM) and asymmetric eigenvector maps (AEM) (Borcard & Legendre 2002, Legendre 2014).
Ultimately I want to:
1) plot a graph with the X axis = time and the Y axis = some 'univariate measure' of composition (I've seen Jaccard distance, the RDA x-axis score, etc.);
2) calculate a 'univariate measure' of the temporal variability of composition (i.e., a multivariate analogue of the coefficient of variation, CV) to plot against other x-axes such as diversity, etc.
My questions are
1) If you use RDA-PCNM or AEM (i.e., the Borcard & Legendre 2002 / Legendre 2014 methods), what is the need for conducting the PCNM first? Why can't you just use the RDA scores based on the original data?
2) An output of the RDA-PCNM can be the RDA x-axis score plotted over time. But this still only shows one dimension of the variability, so you still need to plot the RDA y-axis score too.
Isn't there some way to create a single 'measure' that incorporates the multidimensionality of possible composition changes, such as the Euclidean distance between Time 0 and Time t for each year (perhaps from a PCNM), plotted as the y-score?
What is the advantage of the RDA-PCNM or AEM methods over the Euclidean distance method?
3) To calculate a compositional measure of temporal variability, can I use the Euclidean distances as above and then calculate the CV of these distances?
Thanks for your suggestions.
Data structure:
12 time points (not all sites sampled in all years)
environmental variable = habitat type
3 habitat types with a gradient of vegetation cover from A-C
3 replicate sites in each habitat
multivariate response variable = abundance data across multiple species (community)
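A minimal sketch of the Euclidean-distance idea above: compute the distance from the Time 0 composition to each later sample, then the CV of those distances (toy abundances and function names are illustrative):

```python
import math
from statistics import mean, stdev

def composition_trajectory(comm):
    """Euclidean distance from the first sample (Time 0) to every
    later sample; rows are time points, columns are species
    abundances. Collapses the multivariate change into one number
    per time point."""
    base = comm[0]
    return [math.dist(base, row) for row in comm[1:]]

def cv(xs):
    """Coefficient of variation of a list of distances."""
    return stdev(xs) / mean(xs)

# Three species sampled at four times (hypothetical counts).
comm = [[10, 5, 0],
        [8, 6, 1],
        [4, 9, 3],
        [10, 5, 0]]
dists = composition_trajectory(comm)
variability = cv(dists)
```

One caveat worth keeping in mind: Euclidean distance on raw abundances is dominated by abundant species, which is one reason ecologists often transform the data (e.g., Hellinger) before distance-based analyses.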
Using the Engle-Granger method, I found a cointegrating relationship between my variables. I then estimated the long-run and error-correction models. Which model should I use to check the main assumptions, such as normality and heteroskedasticity?
Hi all,
Just wanted to ask: if I test how IVs A, B, C, D, E and F predict dependent variable Y, given that:
A is point in time (before or after manipulation).
B is the group to which participants belong (control or experiment).
C and D are two measures of well being.
E and F are sex and age.
Y is measure for resilience.
Now, here is where it becomes a bit complex. I assume that after the manipulation there would be an increase in the scores of C and D for the experimental group but not for the control group. Also, C and D scores will positively predict Y scores for both the control and experimental groups. Importantly, I assume that C and D will predict the same or a higher share of the variance in Y scores, and that Y scores will therefore be higher for the experimental group after the manipulation (with no change in the control group). There are no special predictions for sex and age; they are very much covariates.
So, what analysis should I use?
I would like to find the long-run relationship between domestic and international prices. I have run the Johansen cointegration test on the levels with 2 and 6 lags, but I have got mixed results. With 2 lags, the trace test confirms one or more cointegrating vectors, but the max-eigenvalue test does not confirm cointegration. However, with the 6 lags suggested by the AIC and HQ information criteria, there is no cointegration at all. How should I proceed?
Studying time to seed germination under several temperatures, we may consider a germination box, for example, as an experimental unit. In this individual space, I sowed 100 seeds of one species at the same time. These seeds start imbibition immediately, but the chemical reactions inside every seed depend on the physiological quality of each one. Thus, we have germination events at t1, t2, …, tn inside this germination box (the experimental unit). In the experiment, we can have j experimental units for each of k treatments. The question is: may the researcher analyze this data set using a routine for repeated measures over the experimental time?
Does high-frequency data require a special way of treating the data?
If yes, what is the appropriate methodology?
I am working with a multivariate time series composed of observations representing driving style, collected every 0.1 s using a sensor fusion approach (mobile phone). The features are AccX, AccY, AccZ, GyroX, GyroY, GyroZ and Speed. I am trying different methods to segment the series into segments representing meaningful driving events (accelerations, braking, steering). My first approach has been linear segmentation of the individual time series, but I would prefer a multivariate approach.
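One simple multivariate baseline before anything fancier: collapse the three acceleration channels into a magnitude and segment the runs that exceed a threshold. A sketch with illustrative, untuned parameters:

```python
import math

def segment_events(acc_xyz, threshold=1.5, min_len=3):
    """Segment a multivariate accelerometer stream into candidate
    driving events: compute the acceleration magnitude per sample
    and return (start, end) index pairs of runs where it exceeds
    `threshold` for at least `min_len` samples. The threshold and
    minimum length are illustrative, not tuned values."""
    mags = [math.sqrt(x * x + y * y + z * z) for x, y, z in acc_xyz]
    events, start = [], None
    for i, m in enumerate(mags):
        if m > threshold and start is None:
            start = i
        elif m <= threshold and start is not None:
            if i - start >= min_len:
                events.append((start, i))
            start = None
    if start is not None and len(mags) - start >= min_len:
        events.append((start, len(mags)))
    return events

# Quiet driving with one braking-like burst of high magnitude.
stream = ([(0.1, 0.0, 0.1)] * 5
          + [(2.0, 0.5, 0.2)] * 4
          + [(0.1, 0.0, 0.1)] * 5)
events = segment_events(stream)
```

Because the magnitude mixes the axes, it cannot distinguish braking from steering; a natural next step is to classify each detected segment by which axis dominates, or to use a proper multivariate change-point method.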
Let's say that in a restaurant a chair is occupied by a customer. He can sit there for as long as he wants, depending on various factors like the ambience of the surroundings, the quality of the food, the friendliness of the staff, etc. The time duration is split into blocks of 15 minutes, i.e., every 15 minutes a researcher observes whether he is still sitting there or has left. The customer is assigned a dichotomous value of 0 if he leaves and 1 if he continues to occupy the chair.
Time      Customer sitting   No. of servings / Food quality
7:00 pm   1                  5 / Good
7:15 pm   1                  5 / Good
7:30 pm   1                  5 / Good
7:45 pm   1                  5 / Good
8:00 pm   1                  5 / Good
8:15 pm   0
8:30 pm   1                  3 / Average
8:45 pm   1                  3 / Average
9:00 pm   0
9:15 pm   1                  2 / Poor
9:30 pm   0
In the above example, a customer arrives at the restaurant, sits down at 7:00 pm and remains there till 8:00 pm. After that he leaves, and another customer occupies the chair until 9:00 pm. The second customer then leaves, a third arrives at 9:15 pm, and so on.
In this illustration the occupancy of the chair by a customer is the dependent variable taking values 1 or 0, and food quality/no. of servings would be independent variables.
I want to ask whether logit and probit regression can be applied to such a problem. Is it a violation of the independence assumption for the dependent variable that the value following a "0" has to be "1"? Can logit and probit regression be applied with some modifications, and if yes, what are they? Can logit/probit regression be applied to time-series data like this without any loss of generality?
Thanks in Advance
Naseem
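One standard way to reconcile logit with the "a 0 must be followed by a 1" pattern is to treat each customer as a separate spell and fit a discrete-time hazard model: the unit of analysis becomes the customer-interval, and the dependent variable is whether the customer leaves in that interval. A sketch of the data restructuring only (field names are illustrative; the logit itself would then be fitted on these records):

```python
def to_person_periods(rows):
    """Split a chair-occupancy log into per-customer spells for a
    discrete-time hazard logit. `rows` are (time, sitting, servings)
    tuples; a 0 marks the departure that ends the current spell,
    and the next 1 starts a new customer. Returns records of
    (spell_id, period, servings, left); a spell still open at the
    end of the log stays censored (left = 0)."""
    records, spell, period = [], 0, 0
    for time, sitting, servings in rows:
        if sitting == 1:
            period += 1
            records.append((spell, period, servings, 0))
        else:
            # the 0 row closes the previous record as the leave event
            if records and records[-1][0] == spell:
                s, p, serv, _ = records[-1]
                records[-1] = (s, p, serv, 1)
            spell += 1
            period = 0
    return records

log = [("7:00", 1, 5), ("7:15", 1, 5), ("7:30", 1, 5), ("7:45", 1, 5),
       ("8:00", 1, 5), ("8:15", 0, None), ("8:30", 1, 3), ("8:45", 1, 3),
       ("9:00", 0, None), ("9:15", 1, 2), ("9:30", 0, None)]
pp = to_person_periods(log)
```

With this layout the rows are conditionally independent given the covariates and elapsed duration, so the dependence objection largely disappears; including `period` (or dummies for it) as a regressor captures duration dependence.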
Please find my dataset and forecast outputs attached.
A) The first sheet contains March 2011 to February 2014 data and forecasts for March 2014 to February 2015 using the ARIMA, Winters', TBATS and BATS methods. It also contains the forecast errors obtained by comparison with the actual output.
B) The second sheet has forecasts for June 2015 to February 2016 using the above-mentioned methods.
C) R code.
As can be seen, the TBATS method gave the output for 2014-15 with the least error, but there is no trend or seasonality (constant values) in the TBATS output for 2015-16, which is hard to believe.
The BATS method gave the most erroneous output (constant values) for 2014-15, but its forecast for 2015-16 seems reasonable.
I am confused about which method to go for. Should I opt for some other technique, considering my data? Or am I missing something?
Hi all,
Is there any multivariate time series classification problem in which some variables are categorical? In most approaches in this domain, the time series are assumed to have numerical observations, and I am interested in the case where some observations are categorical.
For example, network flow data consists of packets transferred between IP pairs. Each flow can be labeled by its application, such as BitTorrent, Skype, etc. Each flow is a series of packets for which the size, direction and payload information is known. Direction is either upstream or downstream in this particular example. Although it can be represented as a binary variable, the nature of the variable is categorical.
Please, let me know if you have such datasets. Thanks in advance.
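One common workaround, offered as a sketch rather than as a standard from this literature: expand each categorical channel into one indicator channel per category, after which numeric multivariate classifiers apply unchanged:

```python
def one_hot_channels(series, channel, categories):
    """Replace a categorical channel of a multivariate time series
    with one indicator channel per category, so distance- or
    model-based classifiers that expect numeric inputs can be used.
    `series` is a list of per-timestep dicts (channel -> value)."""
    out = []
    for step in series:
        step = dict(step)          # copy; don't mutate the input
        value = step.pop(channel)
        for cat in categories:
            step[f"{channel}={cat}"] = 1.0 if value == cat else 0.0
        out.append(step)
    return out

# A toy network flow: packet size is numeric, direction categorical.
flow = [{"size": 1500, "dir": "down"},
        {"size": 60, "dir": "up"},
        {"size": 1500, "dir": "down"}]
encoded = one_hot_channels(flow, "dir", ["up", "down"])
```

The drawback is that Euclidean-style distances then treat every category change as equally large, which may or may not suit the application labels.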
Hello,
I am doing multilevel modelling where my dependent variable is income and my independent variables are age, type of job (categorical, fixed effect) and sex (categorical, fixed effect).
I have been modelling in MLwiN and everything has gone pretty well. I fitted a quadratic function at level 2 and everything was perfect.
My confusion arose when I modelled the level-1 variance. I modelled the variance and there is clear heteroscedasticity; both coefficients are significant. But in that process my level-2 variance becomes non-significant and its standard error increases a lot.
My only explanation is that the vast majority of the variance is within districts (level 1) rather than between districts (level 2). Therefore, there is a lot of confounded variance across levels, but the majority is at level 1.
Any suggestions for this?
Thank you very much!
The main idea is to use a multivariate time series (as observations) to predict a state variable (one-dimensional).
Please find the attachments.
For example, the time series mm (4 variables and 200 observations) was used to learn the V and W of a DLM. I have two questions in this regard:
1) I supposed that the dimensions of the DLM should be as follows, based on the matrix operations:
FF ∈ R^(4×1)
GG ∈ R^(1×1)
V ∈ R^(4×1) (because the dimensions of FF×θ and V should be the same)
W ∈ R^(1×1)
M0 ∈ R^(1×1)
C0 ∈ R^(1×1)
Therefore, the "V vector.R" code was developed. But an error was displayed:
Error in dlm(FF = matrix(1, N, 1), GG = 1, V = matrix(exp(parm [1:4]), : Incompatible dimensions of matrices
Debug result:
m <- nrow(x$FF)
p <- ncol(x$FF)
if (!is.numeric(x$V))
stop("Component V must be numeric")
if (!(nrow(x$V) == m && ncol(x$V) == m))
stop("Incompatible dimensions of matrices")
Why should V be R^(4×4)?
2) The "V matrix.R" code was developed. But the following error was displayed:
Error in dlm(FF = matrix(1, N, 1), GG = 1, V = matrix(parm[1:16], N, N), :
V is not a valid variance matrix
What is the problem?
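On question 1, a hedged dimensional check (a sketch of the general DLM convention, not of the dlm package internals): in the observation equation y_t = FF·θ_t + v_t, v_t is an m-vector, where m is the number of observed series (nrow(FF)), and V is its covariance matrix, so V must be m×m rather than m×1:

```python
def dlm_shapes(m, p):
    """Shapes in a DLM with an m-dimensional observation y_t and a
    p-dimensional state theta_t:
      y_t = F theta_t + v_t,  v_t ~ N(0, V)
      theta_t = G theta_(t-1) + w_t,  w_t ~ N(0, W)
    V is the covariance of the m-vector v_t, hence square m x m;
    W is the covariance of the p-vector w_t, hence p x p."""
    F_shape = (m, p)
    V_shape = (m, m)
    G_shape = (p, p)
    W_shape = (p, p)
    return F_shape, V_shape, G_shape, W_shape

# Four observed series, a one-dimensional state:
F_shape, V_shape, G_shape, W_shape = dlm_shapes(4, 1)
```

That is consistent with the debug trace above, which checks `nrow(x$V) == m && ncol(x$V) == m` with m = nrow(FF); the second error then says the 4×4 matrix supplied is not a valid variance matrix, i.e., not symmetric positive semi-definite, which an unconstrained `parm[1:16]` generally is not (parameterizing V through a Cholesky factor is the usual remedy).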
I have a multivariate time series database and a label for each subset. I intend to learn the parameters of an HMM (Hidden Markov Model) from the data, for classification.
At first, I selected the label as the state variable. But in this way the classification performance of the HMM is not good.
How should I select the state variables based on the present database?
We have collected daily data from three dorms for two months. We want to compare the control dorm (no treatment) to a dorm with one treatment (water-saving ads) and a dorm with two treatments (water-saving ads and eco-feedback shower heads). What is the best SPSS analysis to conduct, and why? Are there any papers we can cite for this methodology?