Multiple Time Series Analysis - Science topic
Questions related to Multiple Time Series Analysis
I'm using multiple time series measured daily from 2015 to 2021, but the records for some days are missing across all of the series. How can I impute the values for those missing days?
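A minimal sketch of one common option, linear interpolation across the gap between the nearest observed days (the function name and the toy data are illustrative, not from the original question):

```python
from datetime import date, timedelta

def interpolate_missing_days(records):
    """Fill missing daily values by linear interpolation between the
    nearest observed days. `records` maps date -> value, with gaps
    for missing days. Returns a complete date -> value dict."""
    days = sorted(records)
    filled = {}
    for prev, nxt in zip(days, days[1:]):
        filled[prev] = records[prev]
        gap = (nxt - prev).days
        for k in range(1, gap):
            frac = k / gap
            filled[prev + timedelta(days=k)] = (
                (1 - frac) * records[prev] + frac * records[nxt])
    filled[days[-1]] = records[days[-1]]
    return filled

# Example: 2021-01-02 and 2021-01-03 are missing.
obs = {date(2021, 1, 1): 10.0, date(2021, 1, 4): 16.0}
full = interpolate_missing_days(obs)
```

Interpolation is only sensible for smooth series; for variables like rainfall, methods that respect zeros and skewness (or multiple imputation) may be more appropriate.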
I am using the Mann-Kendall test and Sen's slope to assess trends in monthly rainfall datasets spanning 64 years, e.g., Jan 1957, Jan 1958, ..., Jan 2020. Since the region is a semi-arid one, there are a lot of zero values (NOT missing values) in the time series. For example, the time series for rainfall in January has only 15 non-zero values out of 64 data points. My question is: how will this affect the trend test (Mann-Kendall) and the trend slope (Theil-Sen)?
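As a hedged illustration of the mechanics: tied zeros contribute 0 to the Mann-Kendall S statistic and enter the variance through the standard tie correction, so heavy zero-inflation mainly reduces the information available to the test. A small self-contained sketch (toy data, not the rainfall series from the question):

```python
from itertools import combinations
from collections import Counter

def mann_kendall(x):
    """Mann-Kendall S statistic and tie-corrected variance.
    Zeros are ordinary tied values: each pair of zeros contributes
    0 to S, and the tie group shrinks Var(S)."""
    def sign(d):
        return (d > 0) - (d < 0)
    s = sum(sign(xj - xi) for xi, xj in combinations(x, 2))
    n = len(x)
    ties = Counter(x).values()
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in ties)) / 18
    return s, var_s

# Toy series: many zero months plus a few wet ones.
series = [0, 0, 0, 0, 0, 1.2, 0, 0, 3.5, 0]
s, var_s = mann_kendall(series)
```

Without the tie correction, Var(S) for n = 10 would be 2250/18; the eight tied zeros here remove 1176/18 of it, which is how zero-heavy series lose power.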
TLDR: How many variables can I have in a VAR or VECM model?
I am writing my thesis and I am using a VECM (VAR model with error correction for cointegration) model for analyzing the relationship between the prices of an energy exchange and some other factors. So far I have 4 variables in my model and I am thinking of adding more.
My question is: after how many variables does the model become unusable and unstable, or can I add as many as I like?
Thank you for your answers in advance!
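One way to make the limit concrete: a VAR(p) with k variables estimates k·p + 1 coefficients per equation, so the parameter count grows quadratically in k and quickly eats into the sample size. A tiny arithmetic sketch (no econometrics library, just the counting rule):

```python
def var_param_count(k, p, include_const=True):
    """Coefficients per equation and in total for a VAR(p) with k
    variables: each equation regresses one variable on p lags of
    all k variables (plus a constant)."""
    per_eq = k * p + (1 if include_const else 0)
    return per_eq, k * per_eq

# With 4 variables and 4 lags, each equation already
# estimates 17 coefficients:
per_eq, total = var_param_count(k=4, p=4)
```

With, say, 150 monthly observations, 17 coefficients per equation is workable, but doubling k to 8 raises it to 33 per equation, which is why adding variables degrades precision long before the model literally fails to estimate.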
Dear Colleagues,
I estimated the OLS models and ran several diagnostic tests on them; however, the instability in the CUSUMSQ test persists, as shown in the attached figure. What should I do in this case?
Best
Ibrahim
![](profile/Ibrahim-Niftiyev/post/CUSUMSQ_issue_in_OLS_estimations/attachment/6040e76c4bdefa000153f3b5/AS%3A997635657838598%401614866284940/image/cusumsq.png)
Hi,
I am having trouble with a problem in the field of optimal control and the generation of optimal time series.
Let's consider a system whose dynamics are represented by dx/dt = f(t, x(t), u(t), p(t)), x and u being respectively the state and control vectors of the system, and p a vector of parameters which directly influence the system's dynamics.
An example illustrating this would be considering a drone, going from point A to point B, in minimum time, but subject to a windy environment (the wind being represented by the time-dependent variable p(t)).
I have generated, by solving an Optimal Control Problem, optimal time-series for x(t) and u(t), for several values of p=p(t)=constant.
I would now like to interpolate, for any given value of p(t) at time t, the "nearly-optimal" control u(t) to be applied to the system between time t and time t+1, based on the OCP results previously computed.
Would you know if this is even possible? I have not really been able to find published work on this topic; if you have any suggestions, I would be grateful.
Thanks,
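One pragmatic option, assuming the OCP solutions are stored for a grid of constant p values, is simple gain-scheduling: at run time, interpolate the stored control trajectories between the two nearest p values. A minimal sketch (names and data are hypothetical, and there is no optimality guarantee for the interpolated control):

```python
import bisect

def interp_control(p_grid, u_tables, p_now, t_idx):
    """Interpolate a near-optimal control for parameter value p_now
    at time index t_idx, from controls precomputed at the sorted
    grid values in p_grid. u_tables[i][t] is the optimal control
    at time t for p = p_grid[i]."""
    if p_now <= p_grid[0]:
        return u_tables[0][t_idx]
    if p_now >= p_grid[-1]:
        return u_tables[-1][t_idx]
    j = bisect.bisect_right(p_grid, p_now)
    lo, hi = p_grid[j - 1], p_grid[j]
    w = (p_now - lo) / (hi - lo)
    return (1 - w) * u_tables[j - 1][t_idx] + w * u_tables[j][t_idx]

# Offline solutions at wind speeds 0 and 2; query wind 1.0 at t = 0.
p_grid = [0.0, 2.0]
u_tables = [[1.0, 0.5], [3.0, 2.5]]
u = interp_control(p_grid, u_tables, 1.0, 0)
```

Whether linear interpolation over p is adequate depends on how smoothly the optimal control varies with p; near bifurcations of the optimal solution it can fail badly, which is worth checking by re-solving the OCP at intermediate p values.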
Dear Colleagues,
If I have 10 variables in my dataset (time series), of which 9 are explanatory and 1 is dependent, and if I establish that all the variables are non-stationary, should I take the first difference of the dependent variable as well?
Best
Ibrahim
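For reference, first differencing is a purely mechanical step applied to every I(1) series in the regression, the dependent variable included (unless a cointegrating relationship justifies an error-correction setup instead). A trivial sketch:

```python
def first_difference(x):
    """First difference of a series: d_t = x_t - x_(t-1).
    One observation is lost at the start."""
    return [b - a for a, b in zip(x, x[1:])]

y = [100, 103, 101, 106]
dy = first_difference(y)
```

If all ten variables are differenced, remember that the regression then describes short-run changes, not the long-run levels relationship.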
Dear colleagues,
I am capable of estimating linear relationships between X and Y variables via OLS or 2SLS (in EViews, for example); however, I also need to study how to estimate/model non-linear relationships. If you know of any source that explains this in simple language based on time series, your recommendations are most welcome. Thank you in advance.
Best
Ibrahim
Hi,
When the DCC-GARCH model is estimated in Stata, pairwise quasi-correlations are given at the end of the output. What do they mean in practice? Are they the mean values of the dynamic correlations, or something else?
I would much appreciate it if anybody could clarify this.
Kind regards
Thushara
Hi
I've estimated a DCC-GARCH(1,1) model using Stata. At the end of the Stata output, a correlation matrix is given, which is also called the quasi-correlation matrix. Is it the conditional correlation matrix or a different one? If so, is it the average/mean value of the dynamic conditional correlations?
I would much appreciate it if anybody could clarify this.
(I've herewith attached the output)
Kind regards
Thushara
What is the best open-source (i.e., free) approach/library/tool for unsupervised/semi-supervised (i.e., with limited to no training data) time-series anomaly detection, for data like this: https://github.com/numenta/nupic/blob/master/src/nupic/datafiles/extra/nycTaxi/nycTaxi.csv ?
Hi,
In a DCC-GARCH(1,1) model (the dependent variable is the first difference of the logarithm of the series) based on monthly data:
1. How do you interpret the unconditional and conditional correlations in a DCC-GARCH model?
2. Is it possible to get a single correlation matrix for the conditional correlations (like the unconditional correlation matrix, without a correlation for each month and pair)? Or do we just need to present the data using a conditional variance graph for each pair?
Your comments/advice on this would be much appreciated.
Kind regards
Thushara
My clinical study measured blood biomarkers (glucose, insulin, glucagon, GLP-1, GIP, amino acids, etc.) at baseline before an intervention meal and at multiple time points after the meal.
We took these measurements three times, using three different intervention meals on three different days. My main objective is to compare whether the change in blood biomarkers differs between the intervention meals.
There are several AUC calculation methods for this, such as the total AUC, the incremental AUC (which ignores the area under the baseline) and the net incremental AUC (which subtracts the area under the baseline). How do I determine which one to use, and what is the rationale?
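As a hedged illustration of how the three quantities relate, here is a trapezoidal sketch. Note the incremental AUC here simply clips values at baseline rather than computing exact baseline-crossing points, so it slightly overstates the standard (Wolever-style) iAUC on segments that cross the baseline:

```python
def auc_variants(times, values):
    """Trapezoidal AUCs for a postprandial curve.
    total: area under the curve down to zero.
    iauc:  incremental AUC, counting only area above baseline
           (values clipped at baseline; approximate on crossings).
    net:   total AUC minus the baseline rectangle, so area below
           baseline is subtracted and net can be negative."""
    base = values[0]
    total = iauc = 0.0
    for (t0, v0), (t1, v1) in zip(zip(times, values),
                                  zip(times[1:], values[1:])):
        dt = t1 - t0
        total += dt * (v0 + v1) / 2
        inc0, inc1 = max(v0 - base, 0.0), max(v1 - base, 0.0)
        iauc += dt * (inc0 + inc1) / 2
    net = total - base * (times[-1] - times[0])
    return total, iauc, net

# Glucose-like toy curve (mmol/L) at 0, 30, 60, 120 min.
t = [0, 30, 60, 120]
v = [5.0, 8.0, 6.0, 4.0]
total, iauc, net = auc_variants(t, v)
```

The numerical contrast makes the choice concrete: total AUC is dominated by the baseline level, iAUC isolates the rise above baseline, and net iAUC penalizes dips below baseline.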
I am modelling the volatility of international tourist arrivals from several source markets. I mainly use two approaches: ARIMA-GARCH or ARIMA-GJR models, and SARIMA-GARCH or SARIMA-GJR models. The initial estimates suggest that the error terms of some models do not follow a normal distribution, even though the estimation assumed normality. In that case I obtained the Bollerslev-Wooldridge standard errors, as they are said to be better than ordinary standard errors. Since some of the models do not have normally distributed error terms, I re-estimated all the models assuming a Student-t distribution, as is recommended when the error term is non-normal. However, Bollerslev-Wooldridge standard errors are not available with the Student-t distribution (in EViews 10); instead, Huber-White standard errors are available. I am wondering whether these are better than Bollerslev-Wooldridge standard errors or produce approximately similar outcomes. Any advice is much appreciated!
Hi,
can anyone recommend some literature and/or software for multi-level non-hierarchical dynamic factor models?
I am trying to compare two time series and am assessing different methodologies for examining their relationship.
If you have used the Granger causality test, would you be willing to share some literature on the topic, please?
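For intuition, the Granger test compares a restricted autoregression of y on its own lags with an unrestricted one that adds lags of x, via an F statistic on the change in the residual sum of squares. A one-lag sketch in pure Python (toy data with fixed pseudo-noise; real applications use more lags, stationarity checks, and a proper p-value):

```python
def ols_ssr(X, y):
    """Sum of squared residuals from OLS of y on the columns of X,
    via the normal equations and Gaussian elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    for c in range(k):                       # forward elimination
        piv = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        b[c], b[piv] = b[piv], b[c]
        for r in range(c + 1, k):
            f = A[r][c] / A[c][c]
            for j in range(c, k):
                A[r][j] -= f * A[c][j]
            b[r] -= f * b[c]
    beta = [0.0] * k
    for c in reversed(range(k)):             # back substitution
        beta[c] = (b[c] - sum(A[c][j] * beta[j]
                              for j in range(c + 1, k))) / A[c][c]
    return sum((yi - sum(bi * xi for bi, xi in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

def granger_f(x, y):
    """F statistic for 'x Granger-causes y' with one lag."""
    Xr = [[1.0, y[t - 1]] for t in range(1, len(y))]            # restricted
    Xu = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]  # + lag of x
    yy = y[1:]
    ssr_r, ssr_u = ols_ssr(Xr, yy), ols_ssr(Xu, yy)
    n = len(yy)
    return (ssr_r - ssr_u) / (ssr_u / (n - 3))

# x leads y by one step (plus small fixed noise), so F is large.
x = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1]
noise = [0.02, -0.03, 0.04, -0.01, 0.03, -0.04, 0.01,
         -0.02, 0.03, -0.03, 0.02, -0.01]
y = [noise[0]] + [0.9 * xi + e for xi, e in zip(x[:-1], noise[1:])]
f_stat = granger_f(x, y)
```

Granger causality is about predictive content, not structural causation, which is worth stating when reporting it alongside other comparison methods.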
Hi!
We are trying to estimate body mass (W) heritability and cross-sex genetic correlation using MCMCglmm. Our data matrix consists of three columns: ID, sex, and W. Body mass data is NOT normally distributed.
Following previous advice, we first separated weight data into two columns, WF and WM. WF listed weight data for female specimens and “NA” for males, and vice-versa in the WM column. We used the following prior and model combination:
prior1 <- list(R=list(V=diag(2)/2, nu=2), G=list(G1=list(V=diag(2)/2, nu=2)))
modelmulti <- MCMCglmm(cbind(WF,WM)~trait-1, random=~us(trait):animal, rcov=~us(trait):units, prior=prior1, pedigree=Ped, data=Data1, nitt=100000, burnin=10000, thin=10)
The resulting posterior means were suspiciously low (e.g., 0.00002). We calculated heritability values anyway, using the following:
herit1 <- modelmulti$VCV[,'traitWF:traitWF.animal']/
  (modelmulti$VCV[,'traitWF:traitWF.animal']+modelmulti$VCV[,'traitWF:traitWF.units'])
herit2 <- modelmulti$VCV[,'traitWM:traitWM.animal']/
  (modelmulti$VCV[,'traitWM:traitWM.animal']+modelmulti$VCV[,'traitWM:traitWM.units'])
corr.gen <- modelmulti$VCV[,'traitWF:traitWM.animal']/
  sqrt(modelmulti$VCV[,'traitWF:traitWF.animal']*modelmulti$VCV[,'traitWM:traitWM.animal'])
We get heritability estimates of about 50%, which is reasonable, but correlation estimates were extremely low, about 0.04%.
Suspecting the model was wrong, we used the original dataset with all weight data in a single column and tried the following model:
prior2 <- list(R=list(V=1, nu=0.02), G=list(G1=list(V=1, nu=1, alpha.mu=0, alpha.V=1000)))
model <- MCMCglmm(W~sex, random=~us(sex):animal, rcov=~us(sex):units, prior=prior2, pedigree=Ped, data=Data1, nitt=100000, burnin=10000, thin=10)
The model runs, but it refuses to calculate the "herit" values, giving the error message "subscript out of bounds". We would also add that in this case the posterior density graph for sex2:sex.animal is not bell-shaped.
What are we doing wrong? Are we even using the correct models?
Eva and Simona
Hey, dears!
I am looking for a mathematical model with chaotic bursting outside of neural dynamics, without success so far.
In particular, I am interested in whether (using couplings) it is possible to force the Lorenz or Rössler systems, for example, to exhibit this behavior.
I would welcome suggestions of articles or approaches for this.
Thank you.
Regards,
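As a hedged illustration (my own toy construction, not from a published model): slowly sweeping the Lorenz ρ parameter back and forth across the chaos threshold (around ρ ≈ 24.7) makes the trajectory alternate between quiescent spiraling and chaotic phases, a crude bursting-like behavior:

```python
import math

def lorenz_forced(n=80000, dt=0.005, sigma=10.0, beta=8.0 / 3.0):
    """Euler integration of the Lorenz system with a slowly
    modulated rho(t) that crosses the chaos threshold. The period
    of the modulation (100 time units) and the sweep amplitude are
    illustrative choices, not tuned values from the literature."""
    x, y, z = 1.0, 1.0, 1.0
    out = []
    for i in range(n):
        rho = 24.7 + 6.0 * math.sin(2.0 * math.pi * i * dt / 100.0)
        dx = sigma * (y - x)
        dy = x * (rho - z) - y
        dz = x * y - beta * z
        x, y, z = x + dt * dx, y + dt * dy, z + dt * dz
        out.append(x)
    return out

traj = lorenz_forced()
```

This is a slow-fast (non-autonomous) forcing rather than a coupling between two chaotic systems, but the same idea works with a slow subsystem driving ρ, which is closer to how bursting arises in neural models.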
I want to look at the temporal variability of community composition.
I've been reading about methods such as redundancy analysis with principal coordinates of neighbourhood matrices (RDA-PCNM) and asymmetric eigenvector maps (AEM) (Borcard & Legendre 2002, Legendre 2014).
Ultimately I want to:
1) plot a graph with the X axis = time and the Y axis = some 'univariate measure' of composition (I've seen Jaccard distance, the RDA x-axis score, etc.);
2) calculate a 'univariate measure' of the temporal variability of composition (i.e., a multivariate analogue of the coefficient of variation, CV) to plot against other x-axes such as diversity, etc.
My questions are
1) If you use RDA-PCNM or AEM (i.e., the Borcard & Legendre 2002 / Legendre 2014 methods), what is the need for conducting the PCNM first? Why can't you just use the RDA scores based on the original data?
2) An output of the RDA-PCNM can be the RDA x-axis score plotted over time. But this still only shows one dimension of the variability, so you still need to plot the RDA y-axis score too.
Isn't there some way to create a single 'measure' that incorporates the multidimensionality of possible composition changes, such as the Euclidean distance between Time 0 and Time t for each year (perhaps from a PCNM), plotted as the y-score?
What is the advantage of the RDA-PCNM or AEM methods over the Euclidean distance method?
3) To calculate a compositional measure of temporal variability, can I use the Euclidean distances as above and then calculate the CV of these distances?
Thanks for your suggestions.
Data structure:
12 time points (not all sites sampled in all years)
environmental variable = habitat type
3 habitat types with a gradient of vegetation cover from A-C
3 replicate sites in each habitat
multivariate response variable = abundance data across multiple species (community)
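A minimal sketch of the Euclidean-distance idea above: compute the distance from the Time 0 composition to each later sample, then the CV of those distances (toy abundances and function names are illustrative):

```python
import math
from statistics import mean, stdev

def composition_trajectory(comm):
    """Euclidean distance from the first sample (Time 0) to every
    later sample; rows are time points, columns are species
    abundances. Collapses the multivariate change into one number
    per time point."""
    base = comm[0]
    return [math.dist(base, row) for row in comm[1:]]

def cv(xs):
    """Coefficient of variation of a list of distances."""
    return stdev(xs) / mean(xs)

# Three species sampled at four times (hypothetical counts).
comm = [[10, 5, 0],
        [8, 6, 1],
        [4, 9, 3],
        [10, 5, 0]]
dists = composition_trajectory(comm)
variability = cv(dists)
```

One caveat worth keeping in mind: Euclidean distance on raw abundances is dominated by abundant species, which is one reason ecologists often transform the data (e.g., Hellinger) before distance-based analyses.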
Using the Engle-Granger method, I found a cointegrating relationship between my variables. I then estimated the long-run and error-correction models. Which model should I use to check the main assumptions, such as normality and heteroskedasticity?
Hi all,
Just wanted to ask: if I test how IVs A, B, C, D, E and F predict dependent variable Y, given that:
A is point in time (before or after manipulation).
B is the group to which participants belong (control or experiment).
C and D are two measures of well being.
E and F are sex and age.
Y is measure for resilience.
Now, here is where it becomes a bit complex. I assume that after the manipulation there would be an increase in the scores of C and D for the experimental group but not for the control group. Also, C and D scores will positively predict Y scores for both the control and experimental groups. Importantly, I assume that C and D will predict the same or a higher share of the variance in Y scores, and that Y scores will therefore be higher for the experimental group after the manipulation (with no change in the control group). There are no special predictions for sex and age; they are very much covariates.
So, what analysis should I use?
I would like to find the long-run relationship between domestic and international prices. I have run the Johansen cointegration test on the levels with 2 and 6 lags, but I have got mixed results. With 2 lags, the trace test confirms one or more cointegrating vectors, but the max-eigenvalue test does not confirm cointegration. However, with the 6 lags suggested by the AIC and HQ information criteria, there is no cointegration at all. How should I proceed?
Studying time to seed germination under several temperatures, we may consider a germination box, for example, as an experimental unit. In this individual space, I sowed 100 seeds of one species at the same time. These seeds start imbibition immediately, but the chemical reactions inside every seed depend on the physiological quality of each one. Thus, we have germination events at t1, t2, …, tn inside this germination box (the experimental unit). In the experiment, we can have j experimental units for each of k treatments. The question is: may the researcher analyze this data set using a routine for repeated measures over the experimental time?
Does high-frequency data require a special way of treating the data?
If yes, what is the appropriate methodology?
I am working with a multivariate time series composed of observations representing driving style, collected every 0.1 s using a sensor fusion approach (mobile phone). The features are AccX, AccY, AccZ, GyroX, GyroY, GyroZ and Speed. I am trying different methods to segment the series into segments representing meaningful driving events (accelerations, braking, steering). My first approach has been linear segmentation of the individual time series, but I would prefer a multivariate approach.
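One simple multivariate baseline before anything fancier: collapse the three acceleration channels into a magnitude and segment the runs that exceed a threshold. A sketch with illustrative, untuned parameters:

```python
import math

def segment_events(acc_xyz, threshold=1.5, min_len=3):
    """Segment a multivariate accelerometer stream into candidate
    driving events: compute the acceleration magnitude per sample
    and return (start, end) index pairs of runs where it exceeds
    `threshold` for at least `min_len` samples. The threshold and
    minimum length are illustrative, not tuned values."""
    mags = [math.sqrt(x * x + y * y + z * z) for x, y, z in acc_xyz]
    events, start = [], None
    for i, m in enumerate(mags):
        if m > threshold and start is None:
            start = i
        elif m <= threshold and start is not None:
            if i - start >= min_len:
                events.append((start, i))
            start = None
    if start is not None and len(mags) - start >= min_len:
        events.append((start, len(mags)))
    return events

# Quiet driving with one braking-like burst of high magnitude.
stream = ([(0.1, 0.0, 0.1)] * 5
          + [(2.0, 0.5, 0.2)] * 4
          + [(0.1, 0.0, 0.1)] * 5)
events = segment_events(stream)
```

Because the magnitude mixes the axes, it cannot distinguish braking from steering; a natural next step is to classify each detected segment by which axis dominates, or to use a proper multivariate change-point method.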
Let's say that in a restaurant a chair is occupied by a customer. He can sit there for as long as he wants, depending on various factors like the ambience of the surroundings, the quality of the food, the friendliness of the staff, etc. The time duration is split into blocks of 15 minutes, i.e., every 15 minutes a researcher observes whether he is still sitting there or has left. The customer is assigned a dichotomous value of 0 if he leaves and 1 if he continues to occupy the chair.
Time      Customer sitting   No. of servings / Food quality
7:00 pm   1                  5 / Good
7:15 pm   1                  5 / Good
7:30 pm   1                  5 / Good
7:45 pm   1                  5 / Good
8:00 pm   1                  5 / Good
8:15 pm   0
8:30 pm   1                  3 / Average
8:45 pm   1                  3 / Average
9:00 pm   0
9:15 pm   1                  2 / Poor
9:30 pm   0
In the above example, a customer arrives at the restaurant, sits down at 7:00 pm and remains there till 8:00 pm. After that he leaves, and another customer occupies the chair until 9:00 pm. The second customer then leaves, a third arrives at 9:15 pm, and so on.
In this illustration the occupancy of the chair by a customer is the dependent variable taking values 1 or 0, and food quality/no. of servings would be independent variables.
I want to ask whether logit and probit regression can be applied to such a problem. Is it a violation of the independence assumption for the dependent variable that the value following a "0" has to be "1"? Can logit and probit regression be applied with some modifications, and if yes, what are they? Can logit/probit regression be applied to time-series data like this without any loss of generality?
Thanks in Advance
Naseem
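One standard way to reconcile logit with the "a 0 must be followed by a 1" pattern is to treat each customer as a separate spell and fit a discrete-time hazard model: the unit of analysis becomes the customer-interval, and the dependent variable is whether the customer leaves in that interval. A sketch of the data restructuring only (field names are illustrative; the logit itself would then be fitted on these records):

```python
def to_person_periods(rows):
    """Split a chair-occupancy log into per-customer spells for a
    discrete-time hazard logit. `rows` are (time, sitting, servings)
    tuples; a 0 marks the departure that ends the current spell,
    and the next 1 starts a new customer. Returns records of
    (spell_id, period, servings, left); a spell still open at the
    end of the log stays censored (left = 0)."""
    records, spell, period = [], 0, 0
    for time, sitting, servings in rows:
        if sitting == 1:
            period += 1
            records.append((spell, period, servings, 0))
        else:
            # the 0 row closes the previous record as the leave event
            if records and records[-1][0] == spell:
                s, p, serv, _ = records[-1]
                records[-1] = (s, p, serv, 1)
            spell += 1
            period = 0
    return records

log = [("7:00", 1, 5), ("7:15", 1, 5), ("7:30", 1, 5), ("7:45", 1, 5),
       ("8:00", 1, 5), ("8:15", 0, None), ("8:30", 1, 3), ("8:45", 1, 3),
       ("9:00", 0, None), ("9:15", 1, 2), ("9:30", 0, None)]
pp = to_person_periods(log)
```

With this layout the rows are conditionally independent given the covariates and elapsed duration, so the dependence objection largely disappears; including `period` (or dummies for it) as a regressor captures duration dependence.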
Please find my dataset and forecast outputs attached.
A) The first sheet contains March 2011 to February 2014 data and forecasts for March 2014 to February 2015 using the ARIMA, Winters', TBATS and BATS methods. It also contains the forecast errors obtained by comparison with the actual output.
B) The second sheet has forecasts for June 2015 to February 2016 using the above-mentioned methods.
C) R code.
As can be seen, the TBATS method gave the output for 2014-15 with the least error, but there is no trend or seasonality (constant values) in the TBATS output for 2015-16, which is hard to believe.
The BATS method gave the most erroneous output (constant values) for 2014-15, but its forecast for 2015-16 seems reasonable.
I am confused about which method to go for. Should I opt for some other technique, considering my data? Or am I missing something?
Hi all,
Is there any multivariate time series classification problem in which some variables are categorical? In most approaches in this domain, the time series are assumed to have numerical observations, and I am interested in the case where some observations are categorical.
For example, network flow data consists of packets transferred between IP pairs. Each flow can be labeled by its application, such as BitTorrent, Skype, etc. Each flow is a series of packets for which the size, direction and payload information is known. Direction is either upstream or downstream in this particular example. Although it can be represented as a binary variable, the nature of the variable is categorical.
Please, let me know if you have such datasets. Thanks in advance.
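One common workaround, offered as a sketch rather than as a standard from this literature: expand each categorical channel into one indicator channel per category, after which numeric multivariate classifiers apply unchanged:

```python
def one_hot_channels(series, channel, categories):
    """Replace a categorical channel of a multivariate time series
    with one indicator channel per category, so distance- or
    model-based classifiers that expect numeric inputs can be used.
    `series` is a list of per-timestep dicts (channel -> value)."""
    out = []
    for step in series:
        step = dict(step)          # copy; don't mutate the input
        value = step.pop(channel)
        for cat in categories:
            step[f"{channel}={cat}"] = 1.0 if value == cat else 0.0
        out.append(step)
    return out

# A toy network flow: packet size is numeric, direction categorical.
flow = [{"size": 1500, "dir": "down"},
        {"size": 60, "dir": "up"},
        {"size": 1500, "dir": "down"}]
encoded = one_hot_channels(flow, "dir", ["up", "down"])
```

The drawback is that Euclidean-style distances then treat every category change as equally large, which may or may not suit the application labels.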
Hello,
I am doing multilevel modelling where my dependent variable is income and my independent variables are age, type of job (categorical, fixed effect) and sex (categorical, fixed effect).
I have been modelling in MLwiN and everything has gone pretty well. I fitted a quadratic function at level 2 and everything was perfect.
My confusion arose when I modelled the level-1 variance. I modelled the variance and there is clear heteroscedasticity; both coefficients are significant. But in that process my level-2 variance becomes non-significant and its standard error increases a lot.
My only explanation is that the vast majority of the variance is within districts (level 1) rather than between districts (level 2). Therefore, there is a lot of confounded variance across levels, but the majority is at level 1.
Any suggestions for this?
Thank you very much!
The main idea is to use a multivariate time series (as observations) to predict a state variable (one-dimensional).
Please find the attachments.
For example, the time series mm (4 variables and 200 observations) was used to learn the V and W of a DLM. I have two questions in this regard:
1) I supposed that the dimensions of the DLM should be as follows, based on the matrix operations:
FF ∈ R^(4×1)
GG ∈ R^(1×1)
V ∈ R^(4×1) (because the dimensions of FF×θ and V should be the same)
W ∈ R^(1×1)
M0 ∈ R^(1×1)
C0 ∈ R^(1×1)
Therefore, the "V vector.R" code was developed. But an error was displayed:
Error in dlm(FF = matrix(1, N, 1), GG = 1, V = matrix(exp(parm [1:4]), : Incompatible dimensions of matrices
Debug result:
m <- nrow(x$FF)
p <- ncol(x$FF)
if (!is.numeric(x$V))
stop("Component V must be numeric")
if (!(nrow(x$V) == m && ncol(x$V) == m))
stop("Incompatible dimensions of matrices")
Why should V be R^(4×4)?
2) The "V matrix.R" code was developed. But the following error was displayed:
Error in dlm(FF = matrix(1, N, 1), GG = 1, V = matrix(parm[1:16], N, N), :
V is not a valid variance matrix
What is the problem?
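On question 1, a hedged dimensional check (a sketch of the general DLM convention, not of the dlm package internals): in the observation equation y_t = FF·θ_t + v_t, v_t is an m-vector, where m is the number of observed series (nrow(FF)), and V is its covariance matrix, so V must be m×m rather than m×1:

```python
def dlm_shapes(m, p):
    """Shapes in a DLM with an m-dimensional observation y_t and a
    p-dimensional state theta_t:
      y_t = F theta_t + v_t,  v_t ~ N(0, V)
      theta_t = G theta_(t-1) + w_t,  w_t ~ N(0, W)
    V is the covariance of the m-vector v_t, hence square m x m;
    W is the covariance of the p-vector w_t, hence p x p."""
    F_shape = (m, p)
    V_shape = (m, m)
    G_shape = (p, p)
    W_shape = (p, p)
    return F_shape, V_shape, G_shape, W_shape

# Four observed series, a one-dimensional state:
F_shape, V_shape, G_shape, W_shape = dlm_shapes(4, 1)
```

That is consistent with the debug trace above, which checks `nrow(x$V) == m && ncol(x$V) == m` with m = nrow(FF); the second error then says the 4×4 matrix supplied is not a valid variance matrix, i.e., not symmetric positive semi-definite, which an unconstrained `parm[1:16]` generally is not (parameterizing V through a Cholesky factor is the usual remedy).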
I have a multivariate time series database and a label for each subset. I intend to learn the parameters of an HMM (Hidden Markov Model) from the data, for classification.
At first, I selected the label as the state variable. But in this way the classification performance of the HMM is not good.
How should I select the state variables based on the present database?
We have collected daily data from three dorms for two months. We want to compare the control dorm (no treatment) to a dorm with one treatment (water-saving ads) and a dorm with two treatments (water-saving ads and eco-feedback shower heads). What is the best SPSS analysis to conduct, and why? Are there any papers we can cite for this methodology?