Cost Estimation Aided Software for Machined Parts: An Hybrid Model Based on PLM Tools and Data

For each manufacturer, exact cost estimation is both a major priority and a challenge. This routine task is far from optimized and depends on a very large number of strategic and technical parameters. Applications have been developed to help estimate costs, but estimation is a task that would greatly benefit from the re-use of knowledge and data found in alphanumeric and geometric documents. Every manufacturing company has switched to digital data with CAD, CAM, or ERP systems, but one of the main drawbacks is the low usage of all this information, which contains valuable knowledge and expertise. This paper describes the current state of cost estimation and proposes a new hybrid approach whose purpose is to maximize the re-use of information for machined parts. Our approach is based on a parameterized and customized cost model, an extractor of semantic descriptors in geometric documents (Model Based Definition files) and related textual documents, and finally correlations to adjust the cost of machined parts.


Introduction
As competition grows fiercer, companies have to stay competitive while also delivering high-quality products within a reasonable time. A reliable and precise cost estimation is crucial for this: it limits the risk of overheads, allows better production and budget management, and supports better strategic decisions. Just as CAD software has allowed enormous time savings in design [10], cost estimation is a task that would greatly profit from a computer application producing faster and more accurate quotes for both clients and contractors. This task is often entrusted to one or two experts who rely on both tacit knowledge acquired from experience and explicit parameters or standard guidelines [18]. Software applications such as Apriori or Techniquote are sometimes used, but they present limitations: their database has to be kept up to date, they need to be adapted to the different processes of the company, and they often require a lot of time to produce an estimate. Moreover, companies struggle to obtain a repeatable price for the same type of machined part over time, because the estimate combines tacit and explicit knowledge covering the capacity of the company, the type of part and its features, as well as human and financial aspects.
The main objective of this paper is to propose a methodology for cost estimation that can adapt to the resources and capacity of the company. Tacit knowledge remains a key element of the machined part cost estimation process that we need to understand and integrate. Therefore, we suggest an innovative approach based on similarity-based knowledge in order to develop customized cost models which rely on the massive re-use of former data related to similar parts as an alternative to this tacit knowledge.
The rest of the paper is organized as follows: Section 2 focuses on the literature review in cost estimation, model-based definition and part similarity assessment. Section 3 presents the observations made in the industry as well as our hypotheses for the model. In Section 4, we describe our proposed methodology, present our contribution and illustrate the model with a simple example. In Section 5, we discuss some perspectives for future work and conclude the paper.

Literature Review
Cost estimation has been a central concern of industry for almost a century because it is an essential stage in a product life cycle [5]. Thus, one can find a broad range of methods and techniques, each one depending on, for example, the available information, the type of parts, the materials or the industrialization stage [8]. As described by Niazi et al. [15] and Ben-Arieh [2], all of these methods can be divided into two groups, the quantitative and the qualitative ones. Each of them is then separated into two classes, and inside those subdivisions other categories can be made. This is also described in [16]. However, every method has its advantages, its preferred field of application and its limits, and is better suited to a specific time in the product life cycle, as illustrated in [8].

Parametric techniques:
These cost estimation techniques were widely used because they often produce accurate estimates and are easy to implement, but cost drivers need to be correctly identified, which limits each method to one type of part and makes it prone to errors when parts change too much. Even if [6] shows that these techniques have been improved, some limitations are still present and are inherent to the parametric approach.

Analytical techniques:
They group together models based on cost-tolerance, activities, features or operations. Cost-tolerance models are almost all based on the quality-loss function proposed by Taguchi [21] and curve fitting techniques [20]. Models based on machining or form features, such as those presented by Feng et al. [7] or Xu Xinsheng et al. [24], tend not to take into account all the processes and do not consider the impact of tolerances, for example. Moreover, the enormous data requirements of these techniques restrict their use to the final phases of the industrialization process.

Analogical techniques:
These methods rely more on regression analysis or neural networks in order to find relationships between cost and a selected set of variables. They are good at dealing with non-linear problems and at adapting, but they are data-dependent and more difficult to develop or implement [16].

Intuitive techniques:
On the other hand, intuitive techniques, such as rule-based or knowledge-based systems [19], are often quicker and handle uncertainty very well. However, they are also hard to keep updated, limited to one sort of design, and their implementation often requires too much effort as well. Case-based reasoning (CBR) addresses some of these issues and is one of the most used techniques nowadays [8]. The main drawbacks of all the CBR techniques are the need for past designs and the necessity to abstract features or parameters in order to compare parts, resulting in techniques that are applicable to only one type of design.
As can be seen, a large number of models have been developed for various kinds of applications, each one having its advantages and limitations, as shown in [15]. But the existing models often rely too much on mathematical models and do not lend themselves to being customized to the needs of a given company. Recent research papers show a new trend towards quicker and more precise results by combining several approaches, as shown in [11]. As studied in [9], these new methods provide more promising results and are also increasingly based on new capabilities in data processing, semantic search or machine learning, for instance [6][7][8][9][10][11][12][13][14][15][16][17][18]. Indeed, the last decade has shown some interesting prospects for the processing of large volumes of unstructured data. Digital systems such as computer-aided design (CAD) or enterprise resource planning (ERP) have become widely used [1]. But interoperability between these applications is almost non-existent, and information is dispersed across innumerable files, resulting in duplicates and inconsistencies. The problem is even more acute in mechanical engineering, where a significant portion of information is locked up in the 3D geometry. The model-based definition (MBD), which includes the 3D model as well as the product and manufacturing information (PMI), makes it possible to have a complete representation of the part in a single file. Moreover, these crucial data are complex to retrieve from design drawings, and the extraction would be easier from an MBD. According to Quintana and Venne [17][18][19][20][21][22], companies in aeronautics, automotive or software advocate this practice, aiming to decrease design times and to gather all information digitally in the same place.
This proliferation of digital information created the need for search engines, and there exist nearly fifty enterprise search engines, made by Oracle, SAP or Dassault, for example. However, they are limited to the indexing of alphanumeric items. Several approaches for indexing 3D models have been proposed in scientific articles [3], but few applications have gone beyond the stage of university research. Yet some software such as 3DPartFinder is able to index parts directly by their boundary representation [14], which offers a higher degree of accuracy, a much more concise index and improved performance. As pointed out in [14][15][16][17][18][19][20][21][22][23][24], similarity and reuse of knowledge are an under-exploited domain: for example, over 70% of a customized product can be built from existing product design resources, which supports the interest in exploiting similarity.

3 Industrial observations and hypotheses

Observations
In order to verify and put to the test what the literature taught us, several meetings and discussions were conducted with the different companies involved in the project. After multiple visits and cost estimations, we could make several observations, some of them matching what was seen in the research studies and some not:
• There exist at least two methods, and the choice of method depends essentially on the available time, the overall complexity or the added value of the part.
• The model is always affected by strategic factors, and selling cost is not the same as pricing cost.
• There always remains a significant proportion of tacit knowledge, inversely proportional to the allocated time.
• Historical data and reuse of knowledge are underused.
• They only have few options to reduce specific costs.
We also made some observations concerning the models. As shown in equation (1), the majority divide the cost into 3 or 4 distinct categories: material cost, machining cost, post-processing cost and various costs such as margins, tools, hardware and assembly. This is an approach that [5][6][7][8][9][10][11][12][13] brought up in their studies. The practices are also often based on the same approaches but with some variations:
• Using a volumetric approach to raw material cost and machining time
• Submitting tenders in order to know the exact prices
• Estimating the machining time according to the operations and the machines used
• Sorting the parts and using historical data when possible
From what we observe, there are some gaps between research models and industry practice, hence the need to come up with a different approach. The companies seek to maximize their productivity, estimating as fast as possible in order to process a maximum number of requests for quotation. Moreover, even if the time granted to quote a part should depend on its added value, companies often allow only the minimum time necessary to estimate. Therefore, an adequate model should be both simple and quick, and should put the emphasis on the reuse of valid knowledge and the reliability of the cost. The objective would be to reduce the potential error and to make it possible to better understand which parameters the estimation comes from, while offering a variable level of detail depending on what the user is looking for. Fig. 1 shows the contribution of our final model.

Hypothesis
Based on our observations and the state of the art, we can draw some hypotheses for our current work, listed below [12][13][14]:
• Cost depends on several parameters with different levels of influence.
• Geometric similarity implies other correlations, such as similar machining, material or processes, and parts with similar information tend to converge towards the same cost value.
• The more similar parts we have, the higher the degree of confidence we get in a cost estimation.

Overview
The complete methodology has been designed to answer the research questions. It is composed of 4 groups and 6 steps (see Fig. 2). The first group aims to understand the environment of the company, to identify the key parameters and the way of proceeding, and to quantify the tacit knowledge. The second one is the design of a generalized model, which is detailed in the present section. The third one consists in merging all the information obtained during the first step with the model. Finally, the last group is for evaluating the final model and validating the benefit of geometric similarity.

4.2 The generalized model

Problem Description
Based on our observations, companies estimate costs using comparable methods: some parameters are common to all of them, others are specific to the enterprise. We can then divide each problem into two entities: an object P (in costing, it represents the part) and its environment E (represented by the company).

The parameters and functions
Each of these entities is also defined by a number of parameters $p_i$ (see equations (2) and (3)) and a set of functions of the form $p_i = f(p_j)$ with $j \in \{1, \dots, n\}$. We can then classify the parameters and the functions used to calculate the costs into two categories, as illustrated in Table 1. The functions can be anything from a sum to a linear regression and are either exact functions or approximations, depending on the company. They are used to break down a parameter into other, more detailed ones. The number of parameters is determined by the levels of decomposition necessary to obtain an acceptable result and by the process of the enterprise, which specifies the prevalence of certain parameters. The more parameters there are, the more precise the refinement can be. We then have a tree whose number of layers of decomposition depends on the top parameter. In theory, each parameter can be refined until it is exact or comes from a database, but in practice this might not be possible or necessary. The decomposition stops when no function is found to break down the considered parameter or when further decomposition is not considered profitable. As an example, a length or a density cannot be further decomposed, whereas the machining time of a feature can, but it might be too complicated to implement. The challenge is to find a suitable balance between the level of decomposition, its practicability and its benefit.

The notion of source of the parameters
Decomposing a parameter is not the only way of obtaining a value for it. As a consequence, we have decided that the provenance of a parameter is almost as important as its value, and have distinguished five different possible sources: exact, database, tacit, similar or calculus.
Exact: The parameter is taken from the MBD or from databases that are not prone to change over time. A density or an area extracted from the 3D model are examples of an exact source.
Database: The parameter comes from a database, but its value can fluctuate over time, like the price of a material.
Tacit: The value is entered by the user and depends entirely on his personal knowledge.
Calculus: The parameter can be broken down into other parameters using a specific function.
Similar: The value is predicted either by finding a relationship between two or more correlated parameters, with a regression for example, or by using a set of rules. The data used to build the model are taken only from geometrically similar parts, and a similarity index is defined between the new part and the old ones, which makes it possible to keep the most similar parts and to weight the influence of each part within the relationship.
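The decomposition tree and the five sources described above can be sketched as a small data structure. This is only an illustrative sketch: the class and field names (`Source`, `Parameter`, `evaluate`, `depth`) are hypothetical, and the volume used in the example is a made-up value, while 2700 kg/m³ is the usual density of aluminium.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional


class Source(Enum):
    """The five possible provenances of a parameter value."""
    EXACT = "exact"
    DATABASE = "database"
    TACIT = "tacit"
    SIMILAR = "similar"
    CALCULUS = "calculus"


@dataclass
class Parameter:
    name: str
    source: Source
    value: Optional[float] = None  # set for every source except CALCULUS
    children: List["Parameter"] = field(default_factory=list)
    function: Optional[Callable[..., float]] = None  # combines the children

    def evaluate(self) -> float:
        """Return the value, decomposing into children for CALCULUS nodes."""
        if self.source is Source.CALCULUS:
            if self.function is None:
                raise ValueError(f"{self.name}: no function to decompose with")
            return self.function(*(child.evaluate() for child in self.children))
        if self.value is None:
            raise ValueError(f"{self.name}: no value available")
        return self.value

    def depth(self) -> int:
        """Number of decomposition layers below and including this node."""
        return 1 + max((child.depth() for child in self.children), default=0)


# Hypothetical example: a mass decomposed into a volume and a density.
volume = Parameter("volume_m3", Source.EXACT, value=2.0e-4)
density = Parameter("density_kg_m3", Source.EXACT, value=2700.0)
mass = Parameter("mass_kg", Source.CALCULUS,
                 children=[volume, density],
                 function=lambda v, rho: v * rho)

print(mass.evaluate(), mass.depth())
```

In this sketch a leaf is any parameter whose source is not calculus, which mirrors the rule that decomposition stops when no function is available or when it is not worth going further.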

The Reliability and Precision
Another major point is that an estimation has to be reliable and precise. To address this matter, we propose to attribute to each parameter, depending on its source, a reliability coefficient (RC) ranging from 0 to 1 and an absolute error. The reliability coefficient represents the confidence we have in the value, and the absolute error represents the potential error made on the value. The reliability coefficient for the calculus or similarity provenances varies depending on the RC of the parameters used and on their correlation. In the case of a simple multiplication or division, the reliability coefficient behaves as the combined probability of two independent events. For similarity, we use the Wang and Stanley composite reliability formula between two parameters [23] in order to determine the composite RC, which is then adjusted depending on the number of similar parts used, their age and the difference in batch size. The absolute error is taken from the computer tools or the measurement methods. For the calculations and the similarities, we use the partial differential method and the weighted mean absolute error.
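For a parameter obtained as a product of two others, the two rules above can be illustrated with a short Python sketch. The function names are hypothetical, and summing the absolute contributions of each partial derivative is an assumption about how the partial differential method is applied. The RC values 0.897 and 0.95 and the price error of 0.1 are those quoted in the raw material example of this paper; the mass and unit price are made-up values.

```python
from math import prod


def combine_reliability(*coefficients: float) -> float:
    """RC of a product or quotient: product of the input RCs (independence assumed)."""
    for rc in coefficients:
        if not 0.0 <= rc <= 1.0:
            raise ValueError(f"reliability coefficient out of range: {rc}")
    return prod(coefficients)


def product_abs_error(x: float, dx: float, y: float, dy: float) -> float:
    """First-order absolute error of z = x * y: |dz/dx| * dx + |dz/dy| * dy."""
    return abs(y) * dx + abs(x) * dy


# RC of a raw volume obtained by similarity, combined with the RC of the price.
rc_cost = combine_reliability(0.897, 0.95)

# Hypothetical values: 5.6 kg of raw material at 5.0 $/kg, price known to +/- 0.1.
cost_error = product_abs_error(5.6, 0.0, 5.0, 0.1)

print(f"RC of the cost: {rc_cost:.3f}")
print(f"absolute error on the cost: {cost_error:.2f} $")
```

The combined RC obtained this way is close to the value of 0.85 reported in the example, although the paper does not detail that computation.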

Geometric similarity and priority
The model makes it possible to favor some sources depending on the reliability coefficient or on where the data is available. The source with the highest reliability is prioritized if possible, although the user always has the final say and can overwrite any value using his knowledge. The model allows us to get values from different origins for the same parameter. Fig. 3 shows an example of a decision tree for these choices, which can change from parameter to parameter. The other main contribution is the possibility to use the geometric similarity between parts as an alternative way to obtain parameters which are too complicated to decompose or depend too much on tacit knowledge [4]. By finding similar parts, we can then adapt any given parameter. This allows us to transcribe the tacit knowledge in order to reuse it. The model rests on the assumption that geometrically similar parts will likely serve the same function, leading to similar characteristics, which lets us determine by analogy some variables and costs that cannot be calculated with a parametric technique. Furthermore, to refine the similarity search, we can filter the parts by material, for example, in order to reduce the chance of exceptions.
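The prioritization rule described above (highest reliability first, with the user able to overwrite any value) can be sketched as follows. This is a simplified illustration rather than the decision tree of Fig. 3: the names `Candidate` and `select_value` are hypothetical and the volumes are made-up values, while the two RCs are those used in the raw material example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    source: str         # "exact", "database", "tacit", "similar" or "calculus"
    value: float
    reliability: float  # reliability coefficient in [0, 1]


def select_value(candidates: Sequence[Candidate],
                 user_override: Optional[Candidate] = None) -> Candidate:
    """Pick the most reliable candidate unless the user imposes a value."""
    if user_override is not None:
        return user_override
    if not candidates:
        raise ValueError("no candidate value available for this parameter")
    return max(candidates, key=lambda c: c.reliability)


# Two ways of obtaining a raw volume (values in m^3 are hypothetical).
raw_volume_candidates = [
    Candidate("tacit", 1.10e-3, 0.70),
    Candidate("similar", 1.07e-3, 0.897),
]

chosen = select_value(raw_volume_candidates)
forced = select_value(raw_volume_candidates,
                      user_override=Candidate("tacit", 1.12e-3, 0.70))

print(chosen.source, forced.source)
```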

An example with the cost of the raw material
In order to illustrate the process of the generalized model, we take a simple example: calculating the cost of the raw material for a CNC part. Let us say that enterprise 1 machines aluminum parts for the motorcycle market. The parameters for the part and the enterprise are listed in equations (4) and (5). The fictive enterprise has a database linking material, density and price, as shown in Table 2 (densities are correct but prices are fictive). Using our previously described method, a possible decomposition scheme is detailed in Fig. 4. We decide to use exact data as much as possible and to use similarity to interpolate the raw volume. Indeed, we suppose that the raw volume depends on many other parameters, such as the volume of the bounding box, the type of machine, or the machining strategy (near net shape or picture frame, for example). We then expect to find these relations encapsulated in the similar parts machined in the past. As a comparison, the company has its own function to calculate the raw volume, where K is a tacit factor going from 1.05 up to 1.15 (to add 5 to 15% of the volume) with a reliability factor of 0.7.

Fig. 4. Decomposition scheme for the example
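To make the enterprise's approximate calculation more concrete, here is a minimal Python sketch of a raw material cost driven by the tacit factor K. The exact function used by the company is not given in the text, so the formula below (raw volume = K times the bounding box volume, then volume times density times unit price) is an assumption, as are the function name, the bounding box volume and the unit price. The density of 2700 kg/m³ is the usual value for aluminium.

```python
ALUMINIUM_DENSITY_KG_M3 = 2700.0  # usual density of aluminium
PRICE_PER_KG = 5.0                # hypothetical unit price, in $/kg


def raw_material_cost(bounding_box_volume_m3: float, k: float,
                      density_kg_m3: float = ALUMINIUM_DENSITY_KG_M3,
                      price_per_kg: float = PRICE_PER_KG) -> float:
    """Assumed formula: cost = K * bounding box volume * density * unit price."""
    if not 1.05 <= k <= 1.15:
        raise ValueError("the tacit factor K is expected between 1.05 and 1.15")
    raw_volume_m3 = k * bounding_box_volume_m3
    return raw_volume_m3 * density_kg_m3 * price_per_kg


# Hypothetical part with a bounding box of 0.001 m^3: range of costs
# obtained when the expert picks K at either end of its interval.
cost_low = raw_material_cost(0.001, 1.05)
cost_high = raw_material_cost(0.001, 1.15)

print(f"cost between {cost_low:.2f} $ and {cost_high:.2f} $")
```

The spread between the two results shows how strongly such an estimate depends on the expert's choice of K, which is the motivation for replacing it with a similarity-based estimate.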
We have to find the cost of the raw material of part 4, represented as the blue part. Table 3 shows the results of the similarity query and the values of the different parameters, as well as their origin and reliability coefficient. The RC of the different volumes is assumed to be close to 1, as they are taken from the MBD itself. The volume of the raw material is the real value measured by operators, thus we assume the RC to be 0.99 with a negligible absolute error depending on the measurement tools. The RC of the price is set to 0.95, as it is a parameter fluctuating over time, and we assume an absolute error of 0.1 if the database has not been updated recently.
Table 3. Parameters of the similar parts and results of the similarity query (ref in blue)
We have used similarity and calculus to find the volume of the raw material for the new part 4, as well as its cost. Similarity is carried out using a weighted linear least squares regression analysis between the volume of the bounding box and the volume of the raw material. The similarity index serves to weight the regression in order to minimize the estimation error. We have then compared both results with the real cost, which we know, in order to evaluate the potential of our methodology. Fig. 5 displays the outcome of the regression and Table 4 shows the resulting comparison.
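The weighted regression described above can be sketched in plain Python with the closed-form solution of weighted linear least squares, using the similarity index of each past part as its weight. The data below are hypothetical and do not reproduce Table 3; the function name and variable names are illustrative only.

```python
from typing import Sequence, Tuple


def weighted_linear_fit(x: Sequence[float], y: Sequence[float],
                        w: Sequence[float]) -> Tuple[float, float]:
    """Return (slope, intercept) minimizing sum(w_i * (y_i - a*x_i - b)**2)."""
    if not (len(x) == len(y) == len(w)) or len(x) < 2:
        raise ValueError("need at least two points and matching lengths")
    total = sum(w)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum(wi * (xi - x_mean) * (yi - y_mean)
              for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


# Hypothetical similar parts: bounding box and raw volumes in cm^3,
# weighted by their similarity index with the new part.
bounding_box = [120.0, 150.0, 200.0]
raw_volume = [135.0, 160.0, 225.0]
similarity = [0.95, 0.80, 0.60]

slope, intercept = weighted_linear_fit(bounding_box, raw_volume, similarity)
predicted_raw = slope * 170.0 + intercept  # new part, bounding box of 170 cm^3

print(f"predicted raw volume: {predicted_raw:.1f} cm^3")
```

Parts with a higher similarity index pull the fitted line more strongly towards their own values, so the prediction for the new part relies mostly on its closest neighbors.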
Our approach shows promising results when used for the cost of the raw material, which is already a substantial part of the total cost of a CNC machined part. We manage to reduce the absolute error from $2 to $1.18 on a $28 price but, more importantly, we obtain a much higher reliability coefficient for that estimation. The tacit factor K depends on the experts, whereas using similarity we are able to estimate the raw volume more precisely, with an RC up to 0.897 against 0.7. Thus, this allows us to be both closer to the real price ($28.2 estimated with similarity versus $28 real) and more confident, with an RC of 0.85. Furthermore, this reliability can easily be increased if we choose parts with a higher similarity index. The methodology helps to evaluate and measure the quality and the confidence of the estimation, which are both key factors for the quotation process. The first results obtained with our methodology are very encouraging; therefore, more experiments and estimations will be conducted in order to propose a complete cost estimation model for the different companies and to improve our results.
Table 4. Comparison between the similar method and the approximate calculus of enterprise 1

5 Conclusion and future work

Conclusion
In this paper, we proposed a new generalized methodology for cost estimation in the case of CNC machined parts. We identified the different models in circulation in the industry and suggested a new approach to deal with the tacit knowledge that is always a major factor during the estimation process. Similarity can be used as an alternative method to obtain precise and valuable information about the cost of a product. It makes it possible to considerably reduce the risk of errors and to increase the fidelity. Accordingly, this proposal shows the importance of the knowledge contained within companies and how many aspects derive from a part's geometry. Hence, the advantages of similarity for some specific tasks clearly appear.

Future Work
Our future work concerns the development of a standardized cost estimation model for each aspect of a machined part price (material, machining, post-processing, and so on). By talking more with the experts and by tracking and benchmarking some parts, we aim to further refine the parameters, functions and models used for the different calculations and to better integrate and use geometric similarity. We also plan to further test our model in real-world scenarios to extend its validity and address any potential issues.

Fig. 1. The contribution of the project

Fig. 3. Decision tree depending on the source

Fig. 5. Results of the weighted least squares regression

Table 1. Function and parameter types with example

Table 2. Density and price database example