Following the literature reviewed in (Makridis 2021), this work compares employees in the private and public sectors. The comparison takes into account several variables, including the monthly salary and the monthly budget.
It should be noted that public sector employees earn less but receive higher-quality non-wage benefits as part of their overall compensation, which leads to a higher level of satisfaction.
Satisfaction is also influenced by the fact that workers assume fewer responsibilities in their jobs and have a much lower level of practice, which implies fewer opportunities to professionalize.
According to (Muhammad Shahzad Chaudhry 2011), satisfaction with public-sector salaries results from certain compensation systems and from using the principle of “equal work, equal pay” as the basis of a fair and equitable wage system.
We collected data from a sample of 100 participants. The following information was given:
The following key scientific questions will be addressed with this analysis:
Is there a difference between monthly salaries in the public and private sectors?
What is the difference between the monthly budgets of employees in the public and private sectors?
Is there a difference in the number of children between public and private sector employees?
Does job satisfaction differ between employees in the public and private sectors?
Summary statistics and visualizations will be used to explain the data. Continuous variables will be described in terms of mean, SD, minimum, maximum and inter-quartile range. Categorical variables will be described using frequencies and percentages.
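The descriptive step above can be sketched in base R as follows. This is a minimal illustration on simulated data: the data frame `demo` and the helpers `describe_continuous()` and `describe_categorical()` are hypothetical names introduced here, not objects from the original analysis.

```r
# Simulated stand-in for the survey data (values are NOT the real survey).
set.seed(1)
demo <- data.frame(
  Employer     = factor(rep(c("Prive", "Public"), each = 50)),
  Salary       = rnorm(100, mean = 1600, sd = 250),
  Satisfaction = factor(sample(
    c("peu satisfait", "moyennement satisfait", "tres satisfait"),
    100, replace = TRUE))
)

# Continuous variable: mean, SD, minimum, maximum and inter-quartile range
describe_continuous <- function(x) {
  c(mean = mean(x), sd = sd(x), min = min(x), max = max(x), iqr = IQR(x))
}

# Categorical variable: frequencies and percentages
describe_categorical <- function(x) {
  counts <- table(x)
  data.frame(level   = names(counts),
             n       = as.vector(counts),
             percent = as.vector(100 * prop.table(counts)))
}

describe_continuous(demo$Salary)
describe_categorical(demo$Satisfaction)

# Same continuous summary, split by employer type (one column per group)
sapply(split(demo$Salary, demo$Employer), describe_continuous)
```

Keeping the two helpers separate mirrors the distinction the text draws between continuous and categorical variables.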
Continuous variables will be tested for normality via the Shapiro-Wilk test of normality. The null and alternative hypotheses are defined as:
\[ H_{0} : \mbox{the variable follows a normal distribution} \]
\[H_{1}: \mbox{the distribution that the variable follows is not normal}\]
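As an illustration of how this test behaves, the sketch below applies `shapiro.test()` to two simulated samples, one drawn from a normal distribution and one strongly right-skewed. The sample names and parameters are assumptions made for the example only.

```r
set.seed(42)
normal_sample <- rnorm(100, mean = 1600, sd = 250)  # simulated, bell-shaped
skewed_sample <- rexp(100, rate = 1)                # simulated, right-skewed

sw_normal <- shapiro.test(normal_sample)
sw_skewed <- shapiro.test(skewed_sample)

# A p-value above 0.05 means H0 (normality) is not rejected;
# it does not prove that the variable is normally distributed.
sw_normal$p.value

# A p-value below 0.05 is evidence against normality.
sw_skewed$p.value
```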
If there is evidence that the data are normally distributed, a t-test will be performed to compare continuous variables between private and public employees. To determine which t-test is most appropriate, a variance ratio test will be conducted to compare the variances of the continuous variables between the two groups. If there is evidence that the variances are equal, the two-sample t-test will be used; otherwise, the Welch t-test will be used. If there is evidence that the data are not normally distributed, the non-parametric Mann-Whitney test will be applied to compare the variable between private and public employees.
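The decision rule described above can be sketched as a small helper. `compare_groups()` is a hypothetical function written for this illustration. It checks normality within each group, which is one possible reading of the rule; the report itself also inspects the pooled variable.

```r
# Hypothetical helper implementing the test-selection rule on two groups.
compare_groups <- function(x, group, alpha = 0.05) {
  group <- droplevels(factor(group))
  stopifnot(nlevels(group) == 2)
  parts <- split(x, group)

  # Step 1: Shapiro-Wilk normality test within each group (assumption)
  normal <- all(sapply(parts, function(v) shapiro.test(v)$p.value > alpha))
  if (!normal) {
    # Non-parametric alternative: Mann-Whitney (Wilcoxon rank-sum) test
    return(wilcox.test(parts[[1]], parts[[2]], exact = FALSE))
  }

  # Step 2: variance ratio (F) test chooses between pooled and Welch t-test
  equal_var <- var.test(parts[[1]], parts[[2]])$p.value > alpha
  t.test(parts[[1]], parts[[2]], var.equal = equal_var)
}

# Usage on simulated salaries (illustrative values only)
set.seed(7)
group  <- rep(c("Prive", "Public"), each = 50)
salary <- c(rnorm(50, 1850, 220), rnorm(50, 1480, 180))
result <- compare_groups(salary, group)
result$method   # name of the test that was selected
result$p.value
```

Returning the `htest` object unchanged keeps the selected method visible in `result$method`, so the choice of test can be reported alongside the p-value.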
In addition:
A two-sample Kolmogorov-Smirnov test will be used to compare the distribution of the continuous variables between private and public employees.
Categorical variables will be compared using the Chi-Square test.
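To make the Kolmogorov-Smirnov comparison concrete, the sketch below runs `ks.test()` on two simulated samples and recomputes the test statistic by hand as the largest vertical gap between the two empirical distribution functions. The sample names and parameters are assumptions for illustration.

```r
set.seed(3)
salary_private <- rnorm(50, mean = 1850, sd = 220)  # simulated
salary_public  <- rnorm(50, mean = 1480, sd = 180)  # simulated

# Two-sample Kolmogorov-Smirnov test: compares the two empirical CDFs
ks_result <- ks.test(salary_private, salary_public)
ks_result$statistic
ks_result$p.value

# The same statistic computed directly from the empirical CDFs,
# evaluated at every observed value of the pooled sample
grid     <- sort(c(salary_private, salary_public))
d_manual <- max(abs(ecdf(salary_private)(grid) - ecdf(salary_public)(grid)))
d_manual
```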
Table 1 shows the summary statistics for the monthly salary. The overall mean salary is 1644.34 (95% CI: 1590.36 to 1698.32); the mean salary is 1857.21 (95% CI: 1789.33 to 1925.09) in the private sector and 1483.74 (95% CI: 1435.33 to 1532.15) in the public sector.
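As a consistency check on the overall interval, assume it is the usual t-based interval computed on all 100 participants (the report does not state the formula, so this is an assumption). The interval can then be unpacked as:

\[ \bar{x} \pm t_{0.975,\,n-1}\,\frac{s}{\sqrt{n}}, \qquad \frac{1590.36 + 1698.32}{2} = 1644.34, \qquad \frac{1698.32 - 1590.36}{2} = 53.98 \]

The midpoint reproduces the reported mean. With \(n = 100\) and \(t_{0.975,\,99} \approx 1.984\), the half-width of 53.98 corresponds to a standard error of roughly 27.2 and therefore to a sample standard deviation of roughly 272.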
Plot 1 shows key data visualizations to understand the overall distribution of the monthly salary and assess the normality of the data. Plot 1 A) shows the box-plot, B) shows the histogram with the theoretical normal distribution, and C) shows the QQ-plot for normality.
Based on the above visual inspections, there is evidence that the monthly salary follows approximately a normal distribution. The Shapiro-Wilk normality test gives a p-value \(> 0.05\) (p-value\(=0.3995\)), so we fail to reject the null hypothesis of normality; strictly speaking, this does not prove that the null hypothesis is true.
Observation: Since the p-value (\(0.3995\)) is well above 0.05, the test is not significant, and we can treat the distribution of salary as approximately normal (although \(H_{0}\) cannot be confirmed with certainty).
Plot 2 shows key data visualizations to understand the overall distribution of the monthly salary and assess the normality of the data by employer type. Plot 2 A) shows the box-plot by employer type, B) shows the histogram with the theoretical normal distribution by employer type, and C) shows the QQ-plot for normality by employer type.
Based on the above visual inspections, there is evidence that the monthly salary follows approximately a normal distribution in both the private and public sector.
Plot 3 shows the mean monthly salary with the 95% confidence interval for private and public employees. Based on this visual inspection, there is evidence of a difference in the monthly salary, with the mean salary being higher for private employees.
Table 4 presents the variance ratio test. The p-value is above 0.05, so there is not enough evidence to reject the null hypothesis that the variance of the monthly salary is the same for private and public employees. The two-sample t-test with equal variances is therefore used to compare the means, with hypotheses
\[ H_{0}: \mu_{private} = \mu_{public} \] \[ H_{1}: \mu_{private} \ne \mu_{public} \]
Table 5 presents the two-sample t-test, with evidence of a significant difference in the mean salary between private and public employees (p < 0.05). The mean salary is 373.47 (95% CI: 293.43 to 453.51) higher for private employees than for public employees.
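The reported difference and interval can be cross-checked against the group means given earlier:

\[ \bar{x}_{private} - \bar{x}_{public} = 1857.21 - 1483.74 = 373.47, \qquad \frac{293.43 + 453.51}{2} = 373.47, \qquad \frac{453.51 - 293.43}{2} = 80.04 \]

Assuming a pooled t-test on all 100 participants, that is \(100 - 2 = 98\) degrees of freedom and \(t_{0.975,\,98} \approx 1.984\) (an assumption, since the table itself is not reproduced here), the half-width of 80.04 implies a standard error of about 40.3 and a t statistic of roughly 9.3.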
The Kolmogorov-Smirnov plot shows that the empirical cumulative distribution functions of the monthly salary for private and public employees are not the same. Table 6 provides evidence that the distribution of the monthly salary differs between private and public employees (p < 0.05).
Plot 5 shows the mean monthly salary with 95% CI error bars by employer type and employee satisfaction (see Table 7). Based on this descriptive analysis, in the private sector the monthly salary is higher for employees who are satisfied with their job than for those who are not. The opposite pattern is seen in the public sector, where employees who are not satisfied earn a higher salary on average.
Table 7 shows the summary statistics for monthly salary by employer and satisfaction. For example, the mean salary for employees who are “peu satisfait” (not very satisfied) is 1804.91 (95% CI: 1688.61 to 1921.21) in the private sector and 1514.42 (95% CI: 1427.04 to 1601.8) in the public sector.
Based on the above visual inspections, there is evidence that the budget data is normally distributed. The Shapiro-Wilk normality test gives a p-value \(> 0.05\) (p-value\(=0.1606\)), so we fail to reject the null hypothesis of normality, although this does not prove that \(H_{0}\) holds.
Plot 1 shows key data visualizations to understand the overall distribution of the monthly food budget and assess the normality of the data. Plot 1 A) shows the box-plot, B) shows the histogram with the theoretical normal distribution, and C) shows the QQ-plot for normality.
In this section we test the normality of the budget in each of the public and private sectors:
Plot A shows that the average budget in the private sector (about 500) is higher than in the public sector (about 400).
As in the previous section, we first inspect the plots visually to assess normality. In the QQ-plot, all the points fall approximately along the reference line in both employment sectors. The histogram in plot B supports the same conclusion.
Table 1 shows the summary statistics for the monthly budget. The overall mean budget is 428.66 (95% CI: 405.52 to 451.8); the mean budget is 483.85 (95% CI: 447.1 to 520.6) in the private sector and 387.02 (95% CI: 361.51 to 412.53) in the public sector.
Plot 2 shows key data visualizations to understand the overall distribution of the monthly food budget by employer and assess the normality of the data. Plot 2 A) shows the box-plot, B) shows the histogram with the theoretical normal distribution, and C) shows the QQ-plot for normality.
Within each sector, the budget does not differ between satisfaction levels. Overall, the budget is higher in the private sector.
Table 2 shows the summary statistics for food budget by employer and satisfaction. For example, the mean budget for employees who are “peu satisfait” is 481.06 (95% CI: 394.31 to 567.81) in the private sector and 376.21 (95% CI: 331.75 to 420.67) in the public sector.
Student's t-test compares two sample means to assess whether the samples plausibly come from populations with the same mean (null hypothesis \(H_{0}\)) or whether, on the contrary, the means differ significantly.
More precisely, the test gives the probability of observing a difference at least as large as the one in the sample if the null hypothesis were true.
\[ H_{0} : \mbox{There is no difference between the mean budgets of the two sectors} \] \[ H_{1} : \mbox{There is a difference between the mean budgets of the two sectors} \]
##
## Welch Two Sample t-test
##
## data: Enquete$Budget by Enquete$Employer
## t = 4.3577, df = 78.976, p-value = 3.917e-05
## alternative hypothesis: true difference in means between group Prive and group Public is not equal to 0
## 95 percent confidence interval:
## 52.5999 141.0568
## sample estimates:
## mean in group Prive mean in group Public
## 483.8512 387.0228
The test shows a significant difference in the mean food budget between the two employer sectors (p-value \(= 3.917 \times 10^{-5}\)).
In other words, the food budget depends on the employment sector.
It can also be noted that the average food budget is 483.85 in the private sector and 387.02 in the public sector.
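The quantities in the Welch output are tied together by simple arithmetic, which gives a quick way to check the printout:

\[ 483.8512 - 387.0228 = 96.8284, \qquad \mathrm{SE} = \frac{96.8284}{4.3577} \approx 22.22, \qquad \frac{52.5999 + 141.0568}{2} \approx 96.8284 \]

The confidence interval is centred on the difference in means, and its half-width, \((141.0568 - 52.5999)/2 \approx 44.23\), equals about \(1.990 \times 22.22\), where 1.990 is the 97.5% quantile of the t distribution with about 79 degrees of freedom.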
Table 1 shows the summary statistics for the number of children. The overall mean is 1.46 (95% CI: 1.2 to 1.72); the mean is 1.19 (95% CI: 0.81 to 1.57) in the private sector and 1.33 (95% CI: 1.32 to 2.02) in the public sector.
In this section we test whether the number of children follows a normal distribution. From Plot 1 (A), (B) and (C) we can conclude that it does not, since:
Plot 1 shows key data visualizations to understand the overall distribution of the number of children. Plot 1 A) shows the box-plot by employer type, B) shows the histogram with the theoretical normal distribution by employer type, and C) shows the QQ-plot for normality by employer type.
The following table corresponds to the Shapiro-Wilk normality test for the number of children. The p-value is \(8.66 \times 10^{-9}\), which is \(< 0.05\). This implies that the distribution of the data is significantly different from normal. In other words, we conclude from this test that the data is not normally distributed.
Given that the data is not normally distributed, we use the Wilcoxon rank-sum test (equivalent to the Mann-Whitney test) for this comparison. We consider the following hypotheses:
\[ H_{0} : \mbox{The number of children has the same distribution in the two sectors} \] \[ H_{1} : \mbox{The number of children tends to be higher in one sector than in the other} \]
The p-value for the Wilcoxon test is \(0.0206\), i.e. \(<0.05\), so there is statistically significant evidence to reject the null hypothesis, and we conclude that the number of children differs between the two sectors.
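For readers who want to see what the Wilcoxon statistic measures, the sketch below runs `wilcox.test()` on simulated counts and recomputes the statistic W from the ranks. The data frame `children` and its Poisson parameters are assumptions for illustration, not the survey data.

```r
set.seed(11)
children <- data.frame(
  Employer = factor(rep(c("Prive", "Public"), each = 50)),
  Children = c(rpois(50, lambda = 1.0), rpois(50, lambda = 2.0))  # simulated
)

# Counts contain many ties, so an exact p-value is not available;
# exact = FALSE requests the normal approximation explicitly.
w <- wilcox.test(Children ~ Employer, data = children, exact = FALSE)
w$statistic
w$p.value

# W is the rank sum of the first group ("Prive") minus its minimum
# possible value n1 * (n1 + 1) / 2; tied values receive average ranks.
r        <- rank(children$Children)
n1       <- sum(children$Employer == "Prive")
w_manual <- sum(r[children$Employer == "Prive"]) - n1 * (n1 + 1) / 2
w_manual
```

Because the test works on ranks rather than raw values, it compares the overall position of the two distributions and does not rely on normality.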
The following box plot shows that employees from the public sector have on average more children than those from the private sector. There is also a notable difference compared with the salary data, since here there are many outliers in both groups.
Plot 2 shows key data visualizations to understand the overall distribution of the number of children by employer. Plot 2 shows the box-plot by employer type.
The plot shows a clear difference in satisfaction between the employers: nearly 50% of employees in the public sector are “tres satisfait” (very satisfied), compared to 23% in the private sector. In the private sector, however, the majority of employees are “moyennement satisfait” (moderately satisfied).
The chi-square test is a statistical method used to determine whether two categorical variables are significantly associated.
The purpose of this part is to see whether the sector of employment influences employee satisfaction; for this we used the chi-square test of independence.
\[ H_{0} : \mbox{The two variables are independent} \] \[ H_{1} : \mbox{The two variables are dependent} \]
##
## Pearson's Chi-squared test
##
## data: Enquete$Employer and Enquete$Satisfaction
## X-squared = 12.161, df = 2, p-value = 0.002287
Since the p-value (0.002287) is below 0.05, we reject \(H_{0}\) and conclude that employee satisfaction depends on the employer.
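The degrees of freedom and the p-value in the output above can be verified by hand. The table crosses 2 employer types with 3 satisfaction levels, and for 2 degrees of freedom the chi-square tail probability has a closed form:

\[ df = (2-1)(3-1) = 2, \qquad p = P\left(\chi^{2}_{2} > 12.161\right) = e^{-12.161/2} \approx 0.002287 \]

This matches the p-value reported by the test.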
Observation: Since there are more than two groups to compare (“peu satisfait”, “moyennement satisfait”, “tres satisfait”), we cannot use the Wilcoxon or Student's t-tests. In this situation (a 2 x 3 table), the ANOVA test is more appropriate.
Moreover, note that for a 2 x 2 table, the standard chi-square test in chisq.test() is exactly equivalent to prop.test(), but chisq.test() works with data in matrix form.
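The equivalence for a 2 x 2 table can be checked directly. The counts below are hypothetical numbers chosen for illustration; they are not taken from the survey.

```r
# Hypothetical 2 x 2 table: rows are employer types, columns are counts of
# satisfied and not satisfied employees (illustrative numbers only).
counts <- matrix(c(20, 30,
                   35, 15),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(Employer  = c("Prive", "Public"),
                                 Satisfied = c("yes", "no")))

chi <- chisq.test(counts)  # Yates continuity correction is on by default for 2 x 2
prp <- prop.test(counts)   # the two columns are read as successes and failures

# Both tests report the same statistic and the same p-value
c(chisq = unname(chi$statistic), prop = unname(prp$statistic))
c(chisq = chi$p.value, prop = prp$p.value)
```

Both functions apply the continuity correction by default, which is why the results agree without changing any argument.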
According to this study, the average salary and budget in the private sector are higher than those in the public sector. Both are influenced by the employment sector.
Employees from the public sector have on average more children than those from the private sector.
It is evident that job satisfaction differs between the public and private sectors. Even though employees from the public sector are paid less than those from the private sector, nearly 50 percent of them are “tres satisfait”, compared to 23% in the private sector.
Unfortunately, we could not carry out a more advanced analysis of satisfaction, since it has three levels (“peu satisfait”, “moyennement satisfait” and “tres satisfait”), for which the documentation suggests an ANOVA test.