A semantic-based framework for summarization and page segmentation in web mining