Code: Detail

Specializations: ASP.NET, Ajax, C#, Visual Basic .NET, WCF, XML; .NET Framework 1.1, 2.0 and 4.0; OO, Design Patterns; SOA architecture, DDD, MVP and MVVM; HTML, CSS, JavaScript; SQL Server 2000, 2005 and 2008; Oracle 8i and 9i; ASP 3, Visual Basic 6 and COM+

C# - (Portuguese)

Apply encoding ONLY to the text of the HTML

Opens an HTML file and applies HTML encoding ONLY to the text content, IGNORING everything that is HTML markup. Useful when you need to encode things such as accented characters and setting the charset does not solve the problem.

Last update: 31/08/2018
C#, .NET Core 2.0
 
 

using System;
using System.Text;
using System.Text.RegularExpressions;
using System.Web; // HttpUtility.HtmlEncode



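    // Reads an HTML file and HTML-encodes only its text content, leaving the tags untouched.
    public class HTMLDocument
    {

        // Returns the file content with the text between tags encoded and the markup preserved.
        public string teste()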
        {
            
            string _conteudoArquivoPorLinha = string.Empty;
            StringBuilder _retornoConteudoTratado = new StringBuilder();
            StringBuilder _todoHtml = new StringBuilder();

            //Open the HTML file
            //------------------
            using (System.IO.StreamReader file = new System.IO.StreamReader(@"C:\Temp\Cartas\Modelo.htm", Encoding.GetEncoding("iso-8859-1")))
            {
                //Concatenate the whole content, line by line, into the _todoHtml variable
                //------------------------------------------------------------------------
                while ((_conteudoArquivoPorLinha = file.ReadLine()) != null)
                    _todoHtml.Append(_conteudoArquivoPorLinha + " ");



                //Put 10 spaces before and after each HTML tag (and each &nbsp; entity)
                //---------------------------------------------------------------------
                _todoHtml.Replace("<", "          <")
                         .Replace(">", ">          ")
                         .Replace("&nbsp;", "          &nbsp;          ");



                //Split on every run of 10 spaces, storing each chunk in an array of chunks
                //-------------------------------------------------------------------------
                var _trechosArray = _todoHtml.ToString().Split("          ");



                //Scan the chunks looking for plain text to encode; HTML chunks are
                //ignored
                //-----------------------------------------------------------------
                foreach (var _trecho in _trechosArray)
                {
                    string _trechoTratado = string.Empty;

                    //If _trecho is NOT an HTML tag (nor &nbsp;), encode it; otherwise leave it as is
                    if (!Regex.IsMatch(_trecho.Trim(), @"<.*?>") && _trecho.Trim() != "&nbsp;")
                        _trechoTratado =  HttpUtility.HtmlEncode(_trecho.Trim()) + " ";
                    else
                        _trechoTratado =  _trecho.Trim() + " ";

                    _retornoConteudoTratado.Append(_trechoTratado);
                }

            }



            return _retornoConteudoTratado.ToString();

        }
    }
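
The routine above separates markup from text by padding tags with 10 spaces and splitting on that padding, which also changes the whitespace of the document. As a possible alternative, here is a minimal sketch that walks the regex matches for tags and character entities and encodes only the text found between them. The names `HtmlTextEncoder`, `EncodeTextOnly` and `MarkupPattern` are hypothetical and not part of the original routine, and the pattern is an assumption that works for simple, well-formed HTML only (a literal `<` inside text, comments or script blocks would need a real HTML parser).

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;
using System.Web;

// Hypothetical helper, not part of the original routine.
public static class HtmlTextEncoder
{
    // Assumption: a tag is "<...>" and an entity is "&name;" or "&#123;".
    private static readonly Regex MarkupPattern =
        new Regex(@"<[^>]*>|&#?[A-Za-z0-9]+;");

    // Encodes the text between tags/entities and copies the markup unchanged.
    public static string EncodeTextOnly(string html)
    {
        var result = new StringBuilder(html.Length);
        int position = 0;

        foreach (Match markup in MarkupPattern.Matches(html))
        {
            // Text that sits between the previous markup and this one.
            string text = html.Substring(position, markup.Index - position);
            result.Append(HttpUtility.HtmlEncode(text));

            // The tag or entity itself is appended as-is.
            result.Append(markup.Value);
            position = markup.Index + markup.Length;
        }

        // Trailing text after the last tag or entity.
        result.Append(HttpUtility.HtmlEncode(html.Substring(position)));
        return result.ToString();
    }
}
```

A call such as `HtmlTextEncoder.EncodeTextOnly(File.ReadAllText(path, Encoding.GetEncoding("iso-8859-1")))` would play the role of `teste()`, with `path` supplied by the caller. Unlike the padding approach, this sketch keeps the original whitespace and line breaks.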
